Patent abstract:
The techniques described in this specification relate to computer-readable methods, devices, and media configured to determine motion vectors. The techniques relate to encoders and decoders. For example, a decoder receives compressed video data related to a set of frames. The decoder calculates, using a decoder side predictor refinement technique, a new motion vector for a current frame in the set of frames, where the new motion vector estimates motion for the current frame based on one or more reference frames. The calculation includes retrieving a first motion vector associated with the current frame, performing a first portion of the decoding process using the first motion vector, retrieving a second motion vector associated with the current frame that is different from the first motion vector, and executing a second portion of the decoding process using the second motion vector.
Publication number: BR112019013832A2
Application number: R112019013832
Filing date: 2018-01-05
Publication date: 2020-01-28
Inventors: Chen Ching-Yeh; Chuang Tzu-Der
Applicant: MediaTek Inc.
IPC main class:
Patent description:

DECODER SIDE MOTION VECTOR RESTORATION FOR VIDEO CODING
RELATED APPLICATIONS [001] This application claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Application Serial No. 62/442,472, entitled MOTION VECTOR RESTORATION METHODS FOR DECODER SIDE PREDICTOR REFINEMENT, filed January 5, 2017, and U.S. Provisional Application Serial No. 62/479,350, entitled MOTION VECTOR RESTORATION METHODS FOR DECODER SIDE PREDICTOR REFINEMENT, filed March 31, 2017, which are incorporated into this specification by reference in their entirety.
TECHNICAL FIELD [002] The techniques described in this specification relate generally to video coding and, in particular, to decoder side motion vector restoration.
BACKGROUND OF THE INVENTION [003] Video coding involves the compression (and decompression) of a digital video signal. Examples of video coding standards include the H.264 video compression standard and its successor, High Efficiency Video Coding (HEVC). Moving video is formed by taking captures of the signal at periodic time intervals, so that reproducing the series of captures, or frames, produces the appearance of motion. Video encoders include a prediction model that attempts to reduce redundancy using similarities between neighboring video frames. A predicted frame is created from one or more
Petition 870190062098, of 7/3/2019, p. 12/25
previous or future frames that are generally called reference frames. Frames that do not serve as reference frames are generally called non-reference frames.
[004] Since each frame can include thousands or millions of pixels, video coding techniques generally do not process all the pixels in a frame at once. Therefore, a coded frame is divided into blocks that are often called macroblocks. Instead of directly encoding the raw pixel values for each block, the encoder tries to find a block similar to the one it is encoding in a reference frame. If the encoder finds a similar block, the encoder can encode that block using a motion vector, which is a two-dimensional vector that points to the matching block in the reference frame.
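The block-matching idea in the paragraph above can be sketched as follows. This is a hypothetical, simplified illustration, not the patent's procedure: the function names, the exhaustive full search, and the sum-of-absolute-differences (SAD) cost are illustrative choices; real encoders use fast search patterns and sub-pel refinement.

```python
# Hypothetical illustration: exhaustive block-matching search. The encoder
# compares the current block against candidate blocks in a reference frame and
# keeps the displacement (motion vector) with the lowest SAD.

def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(abs(a - b) for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def block_at(frame, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]

def find_motion_vector(cur_frame, ref_frame, top, left, size, search_range):
    """Search ref_frame around (top, left) for the best match of the current
    block; return the (dy, dx) motion vector and its SAD cost."""
    cur_block = block_at(cur_frame, top, left, size)
    best_mv, best_cost = (0, 0), float("inf")
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            ry, rx = top + dy, left + dx
            if ry < 0 or rx < 0 or ry + size > len(ref_frame) \
                    or rx + size > len(ref_frame[0]):
                continue  # candidate block falls outside the reference frame
            cost = sad(cur_block, block_at(ref_frame, ry, rx, size))
            if cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost
```

When the best match is exact, only the motion vector (and a zero residual) needs to be coded, which is the redundancy reduction the background section describes.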
[005] Some techniques explicitly signal motion information to the decoder. Examples of such modes include the merge mode and the Advanced Motion Vector Prediction (AMVP) mode in High Efficiency Video Coding (HEVC). However, having to signal motion vectors can consume a significant amount of data that could otherwise be used by the transmitter to encode other information. Therefore, decoder side motion vector refinement tools can be used to refine, predict, and/or generate motion information in such a way that the motion information can be derived without being explicitly signaled.
SUMMARY OF THE INVENTION
[006] According to the disclosed subject matter, apparatus, systems, and methods are provided for decoder side motion vector restoration techniques that improve the execution speed and efficiency of decoder side motion vector refinement techniques.
[007] Some embodiments relate to a decoding method for decoding video data. The method includes receiving compressed video data related to a set of frames and calculating, using a decoder side predictor refinement technique, a new motion vector for a current frame of the set of frames, in which the new motion vector estimates the motion for the current frame based on one or more reference frames. The calculation includes retrieving a first motion vector associated with the current frame, performing a first portion of the decoding process using the first motion vector, retrieving a second motion vector associated with the current frame that is different from the first motion vector, and executing a second portion of the decoding process using the second motion vector.
[008] In some examples, the first motion vector comprises an unrefined motion vector, the second motion vector comprises a refined motion vector, where the refined MV is refined using a decoder side predictor refinement technique, the first portion of the decoding process comprises a parsing portion, a motion vector derivation portion, or both, and the second portion of the decoding process comprises a reconstruction portion.
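The split described above, an unrefined MV for the first (parsing) portion and a refined MV for the second (reconstruction) portion, might be sketched as follows. All names (`refine_mv`, `decode_block`) and the fixed quarter-pel refinement offset are illustrative stand-ins and are not taken from the patent.

```python
# Hypothetical sketch of the two-MV decoding flow: the parsing portion (MV/MVP
# derivation) uses the unrefined, signaled MV, so parsing of following blocks
# never waits on the refinement step, while the reconstruction portion (motion
# compensation) uses the refined MV produced by a decoder side refinement step.

def refine_mv(unrefined_mv):
    """Stand-in for a decoder side predictor refinement technique (e.g. PMVD,
    BIO, or DMVR); here it just nudges the MV by a fixed quarter-pel offset."""
    return (unrefined_mv[0] + 0.25, unrefined_mv[1] - 0.25)

def decode_block(signaled_mv, mv_buffer):
    # First portion (parsing / MV derivation): uses the unrefined MV.
    mv_buffer.append(signaled_mv)     # unrefined MV kept for later MVP derivation
    prefetch_center = signaled_mv     # reference pixels can be pre-fetched early

    # Second portion (reconstruction): uses the refined MV.
    refined_mv = refine_mv(signaled_mv)
    return {"parsing_mv": prefetch_center, "reconstruction_mv": refined_mv}

mv_buffer = []
result = decode_block((2, -1), mv_buffer)
```

The point of the split is that the expensive refinement sits entirely inside the reconstruction portion, so the MV buffer consulted during parsing holds only values that are available immediately.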
[009] In some examples, the decoding method includes retrieving a third motion vector associated with a second frame, where the third motion vector is a refined motion vector, performing the first portion of the decoding process using the first motion vector and the third motion vector, and performing the second portion of the decoding process using the second motion vector and the third motion vector.
[0010] In some examples, performing the first portion of the decoding process comprises executing a motion vector derivation portion using the first motion vector and the third motion vector, wherein the motion vector derivation portion comprises motion vector prediction derivation, merge candidate derivation, or both.
[0011] In some examples, the execution of the first portion of the decoding process comprises referring to the first motion vector as a decoded motion vector of the current frame.
[0012] In some examples, the decoding method includes using the second motion vector and the third motion vector to perform motion compensation, overlapped block motion compensation, deblocking, or any combination thereof.
[0013] In some examples, the decoding method includes determining that a coding tree unit constraint is not applied to the compressed video data, and retrieving the first motion vector associated with the current frame includes retrieving an unrefined motion vector of the current picture and a refined motion vector associated with a second frame.
[0014] In some examples, retrieving the first motion vector associated with the current frame includes retrieving an unrefined motion vector from a current coding tree unit line, a refined motion vector from an upper coding tree unit line, another tile, or another slice, and a refined motion vector associated with a second frame.
[0015] Some embodiments relate to a decoding method for decoding video data. The method includes receiving compressed video data related to a set of frames and calculating, using a decoder side predictor refinement technique, a new motion vector for a current frame of the set of frames, in which the new motion vector estimates the motion for the current frame based on one or more reference frames. The calculation includes receiving a signal indicative of an initial candidate index for a list of initial motion vector candidates, determining that a first motion vector candidate in the initial motion vector candidate list and a second motion vector candidate comprise a difference that is below a predetermined threshold, removing the second motion vector candidate from the initial motion vector candidate list, not adding the second motion vector candidate to the initial motion vector candidate list, or both, and calculating the new motion vector based on the candidate list and the initial candidate index.
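The candidate pruning just described can be sketched as follows. The L1 distance between MVs and the threshold value are illustrative assumptions; the patent does not fix a particular difference measure here.

```python
# Hypothetical sketch of initial-candidate-list pruning: a candidate is dropped
# (or simply not added) when its difference from an already accepted candidate
# is below a predetermined threshold, so near-duplicates never enter the list.

def prune_candidates(candidates, threshold):
    """Keep a candidate only if it differs from every kept candidate by at
    least `threshold` (L1 distance between the two MVs)."""
    pruned = []
    for cand in candidates:
        if all(abs(cand[0] - kept[0]) + abs(cand[1] - kept[1]) >= threshold
               for kept in pruned):
            pruned.append(cand)
    return pruned
```

A shorter, de-duplicated list means fewer starting points for the decoder side search, which is the speed benefit the summary attributes to this step.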
[0016] In some examples, the decoding method includes analyzing a new motion vector candidate, the motion vector candidate comprising a pair of motion vectors, determining, based on the analysis, that the pair of motion vectors is on the same motion trajectory, and adding the pair of motion vectors to the initial motion vector candidate list.
[0017] In some examples, the decoding method includes analyzing a new motion vector candidate, the motion vector candidate comprising a pair of motion vectors, determining, based on the analysis, that the pair of motion vectors is not along the same motion trajectory, separating the pair of motion vectors into two new pairs of candidate motion vectors, and adding the two candidate motion vector pairs to the initial motion vector candidate list.
[0018] In some examples, the separation includes adding the first motion vector of the motion vector pair to a first of the two new candidate motion vector pairs, filling the first of the two new candidate motion vector pairs with a mirrored motion vector derived from the first motion vector, adding the second motion vector of the motion vector pair to a second of the two new candidate motion vector pairs, and filling the second of the two new candidate motion vector pairs with a mirrored motion vector derived from the second motion vector.
[0019] Some embodiments relate to an encoding method for encoding video data. The method includes calculating compressed video data related to a set of frames, comprising calculating a new motion vector for a current frame of the set of frames, where the new motion vector estimates motion for the current frame based on one or more reference frames, including calculating a first motion vector associated with the current frame, performing a first portion of the encoding process using the first motion vector, calculating a second motion vector associated with the current frame that is different from the first motion vector, and performing a second portion of the encoding process using the second motion vector.
[0020] In some examples, calculating the first motion vector comprises calculating an unrefined motion vector, a set of unrefined motion vectors, or both, and executing the first portion of the encoding process comprises performing a syntax encoding portion, a motion vector derivation portion, a motion vector predictor derivation portion, or some combination thereof.
[0021] In some examples, executing the motion vector predictor derivation portion comprises generating a list of merge candidates, generating a list of advanced motion vector predictor candidates, or both.
[0022] In some examples, the encoding method includes performing motion vector encoding, generating a motion vector prediction, or both, using the unrefined motion vector, the set of unrefined motion vectors, or both, such that the unrefined motion vector, the set of unrefined motion vectors, or both are not refined using a decoder side motion vector refinement tool.
[0023] In some examples, calculating the second motion vector includes calculating a refined motion vector, where the refined motion vector is calculated using an encoder side refinement technique, storing the refined motion vector in a motion vector buffer set, and performing the second portion of the encoding process comprises performing a motion compensation portion, an overlapped block motion compensation portion, a deblocking portion, or some combination thereof.
[0024] Some embodiments relate to an apparatus configured to decode video data. The apparatus includes a processor in communication with memory. The processor is configured to execute instructions stored in the memory that cause the processor to receive compressed video data related to a set of frames, and to calculate, using a decoder side predictor refinement technique, a new motion vector for a current frame of the set of frames, where the new motion vector estimates the motion for the current frame based on one or more reference frames. The calculation includes retrieving a first motion vector associated with the current frame, performing a first portion of the decoding process using the first motion vector, retrieving a second motion vector associated with the current frame that is different from the first motion vector, and performing a second portion of the decoding process using the second motion vector.
[0025] In some examples, the first motion vector comprises an unrefined motion vector, the second motion vector comprises a refined motion vector, where the refined MV is refined using a decoder side predictor refinement technique, the first portion of the decoding process comprises a parsing portion, a motion vector derivation portion, or both, and the second portion of the decoding process comprises a reconstruction portion.
[0026] In some examples, the processor is configured to execute instructions stored in the memory that cause the processor to retrieve a third motion vector associated with a second frame, where the third motion vector is a refined motion vector, perform the first portion of the decoding process using the first motion vector and the third motion vector, and perform the second portion of the decoding process using the second motion vector and the third motion vector.
[0027] Some embodiments relate to an apparatus configured to decode video data. The apparatus includes a processor in communication with memory. The processor is configured to execute instructions stored in the memory that cause the processor to receive compressed video data related to a set of frames, and to calculate, using a decoder side predictor refinement technique, a new motion vector for a current frame of the set of frames, where the new motion vector estimates the motion for the current frame based on one or more reference frames. The calculation includes receiving
a signal indicative of an initial candidate index for a list of initial motion vector candidates, determining that a first motion vector candidate in the initial motion vector candidate list and a second motion vector candidate comprise a difference that is below a predetermined threshold, removing the second motion vector candidate from the initial motion vector candidate list, not adding the second motion vector candidate to the initial motion vector candidate list, or both, and calculating the new motion vector based on the candidate list and the initial candidate index.
[0028] In some examples, the processor is configured to execute instructions stored in the memory that cause the processor to analyze a new motion vector candidate, the motion vector candidate comprising a pair of motion vectors, determine, based on the analysis, that the pair of motion vectors is along the same motion trajectory, and add the pair of motion vectors to the initial motion vector candidate list.
[0029] In some examples, the processor is configured to execute instructions stored in the memory that cause the processor to analyze a new motion vector candidate, the motion vector candidate comprising a pair of motion vectors, determine, based on the analysis, that the pair of motion vectors is not along the same motion trajectory, separate the pair of motion vectors into two new pairs of candidate motion vectors, and add the two candidate motion vector pairs to the initial motion vector candidate list.
[0030] Some embodiments relate to an apparatus configured to encode video data. The apparatus includes a processor in communication with memory. The processor is configured to execute instructions stored in the memory that cause the processor to calculate compressed video data related to a set of frames, comprising calculating a new motion vector for a current frame of the set of frames, in which the new motion vector estimates motion for the current frame based on one or more reference frames, including calculating a first motion vector associated with the current frame, performing a first portion of the encoding process using the first motion vector, calculating a second motion vector associated with the current frame that is different from the first motion vector, and performing a second portion of the encoding process using the second motion vector.
[0031] In some examples, calculating the first motion vector includes calculating an unrefined motion vector, a set of unrefined motion vectors, or both, and performing the first portion of the encoding process comprises performing a syntax encoding portion, a motion vector derivation portion, a motion vector predictor derivation portion, or some combination thereof.
[0032] In some examples, calculating the second motion vector comprises calculating a refined motion vector, in which the refined motion vector is calculated using an encoder side refinement technique, storing the refined motion vector in a motion vector buffer set, and performing the second portion of the encoding process comprises performing a motion compensation portion, an overlapped block motion compensation portion, a deblocking portion, or some combination thereof.
[0033] The features of the disclosed subject matter have thus been outlined, rather broadly, so that the detailed description that follows may be better understood, and so that the present contribution to the art may be better appreciated. There are, of course, additional features of the disclosed subject matter that will be described below and that will form the subject of the appended claims. It is to be understood that the phraseology and terminology used in this specification are for the purpose of description and should not be regarded as limiting.
BRIEF DESCRIPTION OF THE DRAWINGS [0034] In the drawings, each identical or nearly identical component that is illustrated in various figures is represented by a like reference character. For purposes of clarity, not every component may be labeled in every drawing. The drawings are not necessarily drawn to scale, with emphasis instead being placed on illustrating various aspects of the techniques and devices described in this specification.
[0035] Figure 1 shows an exemplary video encoding configuration.
[0036] Figure 2 shows an exemplary technique for deriving temporal derived motion vector predictions (MVPs).
[0037] Figure 3 shows an exemplary pattern-based motion vector derivation (PMVD) technique using the bilateral matching merge mode.
[0038] Figure 4 shows an example of scaling a motion trajectory.
[0039] Figure 5 shows an exemplary pattern-based motion vector derivation (PMVD) technique using the template matching merge mode.
[0040] Figure 6 shows an exemplary decoder architecture.
[0041] Figure 7 shows an example of a decoder pipeline execution when executing a decoder architecture, such as the decoder architecture shown in Figure 6.
[0042] Figure 8 shows an example of a decoder pipeline execution when executing a decoder side predictor refinement tool.
[0043] Figure 9 shows an example of a decoder side motion vector refinement (DMVR) process that uses two reference pictures.
[0044] Figure 10 shows an exemplary two-stage search process for searching for a new (for example, better) matching block using the bi-predicted block.
[0045] Figure 11 is a diagram illustrating overlapped block motion compensation (OBMC) performed at the sub-block level for motion compensation (MC) block boundaries.
[0046] Figure 12A shows an exemplary high-level summary of the OBMC method.
[0047] Figure 12B shows an exemplary high-level summary of the OBMC method when using an initial MV.
[0048] Figure 13 illustrates a high level representation of the MV set for the current CTU, the left column, and the line above.
[0049] Figure 14 shows an example of MV candidate pairs on the same motion trajectory and not on the same motion trajectory, according to some embodiments.
[0050] Figure 15 shows an exemplary decoding method for decoding video data using two MVs, according to some embodiments.
[0051] Figure 16A shows an exemplary method for pruning a list of motion vector candidates, according to some embodiments.
[0052] Figure 16B shows an exemplary method for generating a list of motion vector candidates, according to some embodiments.
DETAILED DESCRIPTION OF THE INVENTION [0053] The inventors have recognized and appreciated that various techniques can be used to improve the execution of decoder side predictor refinement techniques, such as pattern-based motion vector derivation (PMVD), bidirectional optical flow (BIO), and decoder side motion vector refinement (DMVR). Decoder side predictor refinement tools can cause processing delays due to how the motion vectors (MVs) are calculated and reconstructed. Techniques can be used to allow an execution timing similar to that of traditional decoding methods that do not predict MVs (for example, when the motion vector information is signaled from the encoder). For example, a decoding process can be adjusted so that the MVs can be reconstructed early in the process, thus allowing the decoder to pre-fetch the necessary reference pixels in a way that hides the latency cycles required to fetch the data. As an example of such techniques, the unrefined MV can be (a) restored back to the MV buffer and/or (b) left unmodified, so that the unrefined MV can be used by the decoder side MV refinement tools or used to derive the reference MV or the MV candidates (for example, the merge candidate list and the advanced motion vector predictor list) for the following blocks.
[0054] The use of such techniques (for example, restoring the unrefined MV) can, however, cause blocking artifacts and/or other coding inefficiencies. For example, in addition to using the unrefined (restored) MV for parsing, the decoder may also use the unrefined MV for deblocking, overlapped block motion compensation (OBMC), and/or collocated temporal MV derivation. The techniques described in this specification allow the decoder to use a different MV (for example, different from the unrefined MV) for the processing performed after the parsing stage, such as deblocking, OBMC, and/or collocated temporal MV derivation. For example, the first MV used for parsing (for example, the MV/MVP derivation) may be an unrefined MV, and the second MV used for other processing, including deblocking, OBMC, and/or collocated temporal MV derivation, may be a refined MV.
[0055] In some embodiments, the decoder uses two sets of motion vectors: the decoder uses a first set of MVs for a first part of the decoding process (for example, for parsing, including MV derivation and pixel pre-fetching) and uses a second set of MVs for a second part of the decoding process (for example, for reconstruction, including motion compensation, OBMC, and/or deblocking). In some embodiments, CTU line data is incorporated to allow more of the processing to use refined MVs (for example, using the refined MVs from the upper CTU line). For example, the first set of MVs can include an unrefined motion vector from the current coding tree unit line, a refined motion vector from an upper coding tree unit line, and a refined motion vector associated with a second frame. The second set of MVs can include a refined MV from the current picture and a refined MV from the other picture.
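The bookkeeping behind the two MV sets above might be sketched as follows. The record structure and function names are illustrative assumptions, not the patent's data layout; the point is only the selection rule: unrefined for same-CTU-line neighbors during parsing, refined everywhere else and always for reconstruction.

```python
# Hypothetical sketch of two-MV-set selection: for the parsing stage, an MV is
# served unrefined when it comes from the current CTU line (its refinement may
# not be finished yet), and refined when it comes from an upper CTU line or
# from another, already decoded, picture.

def mv_for_parsing(record, current_ctu_line, current_picture):
    """record: dict with 'unrefined', 'refined', 'ctu_line', 'picture' keys."""
    same_picture = record["picture"] == current_picture
    same_ctu_line = same_picture and record["ctu_line"] == current_ctu_line
    if same_ctu_line:
        return record["unrefined"]   # refinement not guaranteed to be done yet
    return record["refined"]         # upper CTU line or another picture

def mv_for_reconstruction(record):
    """The reconstruction stage (MC, OBMC, deblocking) uses the refined MV."""
    return record["refined"]
```

This mirrors the text: the parsing pipeline never stalls waiting on refinement in the current CTU line, while everything downstream of parsing still benefits from refined MVs.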
[0056] These and other techniques may allow post-parsing processing to use the refined MV to avoid additional blocking artifacts. Such techniques can provide a higher coding gain compared to using the unrefined MV for the MV processing performed after the parsing stage. These and other techniques are described further in this specification.
[0057] In the description that follows, numerous specific details are set forth regarding the systems and methods of the disclosed subject matter and the environment in which such systems and methods may operate, in order to provide a thorough understanding of the disclosed subject matter. It will be apparent to one skilled in the art, however, that the disclosed subject matter may be practiced without such specific details, and that certain features that are well known in the art are not described in detail in order to avoid complicating the disclosed subject matter. In addition, it will be understood that the examples provided below are exemplary, and that it is contemplated that there are other systems and methods that are within the scope of the disclosed subject matter.
[0058] Figure 1 shows an exemplary video coding configuration 100, according to some embodiments. Video source 102 is a video source and can be, for example, digital television, Internet-based video, video calling, and/or the like. Encoder 104 encodes the video source into encoded video. The encoder 104 may reside on the same device that generated the video source 102 (e.g., a cell phone, for video calls) and/or may reside on a different device. The receiving device 106 receives the encoded video from the encoder 104. The receiving device 106 can receive the video as a video product (for example, a digital video disc or other computer-readable medium), over a broadcast network, via a mobile network (for example, a cellular network), and/or via the Internet. The receiving device 106 can be, for example, a computer, a cell phone, or a television. The receiving device 106 includes a decoder 108 that is configured to decode the encoded video. The receiving device 106 also includes a display 110 for displaying the decoded video.
[0059] As explained above, part of the decoding process relies on motion vectors. In examples where the encoder (for example, encoder 104) does not include the final MV information directly in the encoded video, the decoder (for example, decoder 108 in the receiving device 106) can employ receiver side prediction tools, often called receiver side predictor refinement tools or decoder side predictor refinement tools. An example of a receiver side predictor refinement tool is the pattern-based motion vector derivation (PMVD) mode, which can also be called the Frame Rate Up Conversion (FRUC) mode. PMVD is described, for example, in Joint Video Exploration Team (JVET) document JVET-F1001, entitled Algorithm Description of Joint Exploration Test Model 6 (JEM 6), which is incorporated by reference into this specification in its entirety.
[0060] Other examples of decoder side predictor refinement tools include bidirectional optical flow (BIO) and decoder side motion vector refinement (DMVR). For example, BIO was proposed by Samsung in the third JCTVC meeting and the 52nd VCEG meeting, and is disclosed in documents JCTVC-C204 and VCEG-AZ05. See, for example, Elena Alshina and Alexander Alshin, Bi-Directional Optical Flow, October 7-15, 2010 (JCTVC-C204) (including the two attached Microsoft Excel spreadsheets), and E. Alshina et al., Known Tools Performance Investigation for Next Generation Video Coding, June 19-26, 2015 (VCEG-AZ05) (including the Microsoft PowerPoint presentation), the contents of which are incorporated herein by reference in their entirety. BIO uses the assumptions of optical flow and steady motion to achieve sample-level motion refinement. It is typically applied only for truly bi-directionally predicted blocks, which are predicted from two reference frames, where one is the previous frame and the other is the latter frame. In VCEG-AZ05, BIO uses a 5x5 window to derive the motion refinement of each sample; therefore, for an NxN block, the motion compensated results and the corresponding gradient information of an (N+4)x(N+4) block are required to derive the sample-based motion refinement of the current block. A 6-tap gradient filter and a 6-tap interpolation filter are used to generate the gradient information in BIO. Therefore, the computational complexity of BIO is much higher than that of traditional bi-directional prediction. For additional information, see D. Marpe, H. Schwarz and T. Wiegand: Context-Based Adaptive Binary Arithmetic Coding in the H.264/AVC Video Compression Standard, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 13, No. 7, pp. 620-636, July 2003, incorporated into this specification by reference in its entirety.
[0061] The PMVD itself can be executed using different modes, such as, for example, the bilateral matching merge mode or the template matching merge mode. Normally, the mode for the decoder to use is signaled in the encoded video. Thus, the encoder signals the decoder to use PMVD mode and also signals which specific PMVD mode. In some examples, a FRUC_mrg_flag is signaled when merge_flag or skip_flag is true. If the FRUC_mrg_flag is 1, then a FRUC_merge_mode is signaled to indicate whether the bilateral matching merge mode (for example, described further in conjunction with Figures 2-4) or the template matching merge mode (for example, described further in conjunction with Figure 5) is selected.
[0062] In summary, the two PMVD modes use decoded pixels to derive the motion vector for the current block. A new temporal motion vector prediction (MVP) called temporal derived MVP is derived by scanning all MVs in all reference frames. An image usually refers to a number of frames (for example, an image includes sixteen frames). These reference frames are placed into one or two reference picture lists. For a P-slice, only one reference picture list is used. For a B-slice, two reference picture lists are used. Generally, for a B-slice, two reference picture lists are used to store past and future pictures, which are generally called LIST_0 for past pictures and LIST_1 for future pictures.
[0063] To derive the LIST_0 temporal derived MVP, for each LIST_0 MV in the LIST_0 reference frames, the MV is scaled to point to the current frame. The block that is pointed to by the scaled MV in the current frame is the target current block. The MV is further scaled to point to the reference picture for which refIdx is equal to 0 in LIST_0 for the target current block. The further scaled MV is stored in the LIST_0 MV field for the target current block. Figure 2 shows an example 200 of deriving the temporal derived MVPs. The decoder scans all the LIST_0 MVs in the LIST_0 reference pictures for which the refIdx is equal to 1. For a LIST_0 MV (shown by arrows 202, 204), a scaled MV that points to the current picture is derived for each LIST_0 MV (shown by dotted arrows 206 and 208 for reference picture 201). A 4x4 block 210, 212 in the current picture 205 is pointed to by each of the scaled MVs. Then, another scaled MV 214, 216 is assigned to the pointed 4x4 blocks 210, 212, respectively, in the current picture, where the scaled MV 214, 216 is along the associated scaled MV 202, 204, but the start point is the current picture 205 and the end point is the reference picture 218 with refIdx equal to 0 in LIST_0.
The decoder scans all the MVs in all the 4x4 blocks in all the reference pictures to generate the temporal derived LIST_0 and LIST_1 MVPs of the current frame. For each MV, the MV is scaled to get the crossed block in the current picture. The decoder then calculates the scaled MVP and assigns it to the crossed block (shown as the blocks pointed to by the dotted arrows 206, 208).
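The MV scaling used throughout this derivation can be sketched as follows, assuming constant motion along the trajectory so that an MV scales linearly with the temporal (picture order count) distance it spans. The function name and the distance convention are illustrative.

```python
# Hypothetical sketch of temporal MV scaling: an MV spanning src_distance
# pictures is rescaled to span dst_distance pictures by the ratio of the two
# temporal distances, under the constant-motion assumption.

def scale_mv(mv, src_distance, dst_distance):
    """Scale MV (dy, dx) from a src_distance span to a dst_distance span.
    A negative dst_distance flips the MV toward the opposite direction."""
    factor = dst_distance / src_distance
    return (mv[0] * factor, mv[1] * factor)
```

For example, an MV covering four pictures is quartered to point one picture away, and scaling by a negative distance yields the mirrored MV toward the other reference list.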
[0064] Figure 3 shows an example of the PMVD bilateral matching merge mode. For bilateral matching, the decoder finds the two most similar reference blocks in LIST_0 and LIST_1 that lie on the same motion trajectory. As shown in Figure 3, for the current picture (or pic) 300, the decoder selects a macroblock (or block) 302 in the Ref0 reference frame 304 of LIST_0 and a second block 306 in the Ref1 reference frame 308 of LIST_1. The decoder essentially assumes that the motion is constant and uses the midpoint of both macroblocks to generate the motion trajectory 310. The decoder uses the motion trajectory 310 to find the current predicted macroblock (or block) 312 in the current picture 300. The decoder calculates the difference between the block 302 and the block 306. If there is only a small difference, the decoder knows that the blocks are very similar. In some examples, the decoder can compute the sum of absolute differences (SAD) to measure the difference between the two blocks. The decoder varies the motion trajectory to minimize the difference between the blocks.
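The matching cost described above can be sketched as follows. This is a simplified illustration that assumes equal POC distances to the two reference pictures (so the LIST_1 displacement is the mirror of the LIST_0 displacement); frames are plain 2-D lists of samples, and the function names are illustrative:

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equal-sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def bilateral_cost(ref0, ref1, x, y, bw, bh, mv):
    """Cost of a candidate MV: SAD between the block displaced by +mv in
    Ref0 and the block displaced by -mv in Ref1 (same motion trajectory)."""
    b0 = [[ref0[y + mv[1] + j][x + mv[0] + i] for i in range(bw)]
          for j in range(bh)]
    b1 = [[ref1[y - mv[1] + j][x - mv[0] + i] for i in range(bw)]
          for j in range(bh)]
    return sad(b0, b1)
```

The decoder would evaluate this cost for each candidate trajectory and keep the MV pair with the smallest value.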
[0065] The decoder builds the initial motion vector (MV) lists for LIST_0 and LIST_1, respectively. The decoder uses eleven candidates for each list, including seven MVs of merge candidates and four temporal derived MV predictors (MVPs). The decoder evaluates these eleven candidates to select the best starting point. In particular, the decoder searches for a pair across the two neighboring frames. When considering the candidates for each list, the decoder analyzes the 22 motion vectors to derive 22 motion vector pairs. The decoder generates the MV pairs by scaling the motion trajectory. For each MV in one list, an MV pair is generated by composing that MV with the mirrored MV, which is derived by scaling the MV into the other list. For each MV pair, the two reference blocks are compensated using that MV pair. Figure 4 shows an example 400 of scaling a motion trajectory. In particular, the motion trajectory 402 from the current picture to ref1 in LIST_0 is scaled, as shown, to the motion trajectory 404 from the current picture to ref0 in LIST_1. The decoder calculates a cost for each of the 22 motion vector pairs (for example, using the SAD) and selects the MV pair with the lowest cost as the starting point of the bilateral matching merge mode.
[0066] The decoder then refines the selected MV pair. The decoder searches different blocks around the starting point to decide which block is the best match. In some examples, the current PU is divided into sub-PUs. The depth of the sub-PU is signaled in the sequence parameter set (SPS) (for example, 3). In some examples, the minimum sub-PU size is a 4x4 block. For each sub-PU, several starting MVs in LIST_0 and LIST_1 are selected, which include the MV derived at the PU level, the zero MV, the HEVC collocated TMVP of the current sub-PU and of the bottom-right block, the temporal derived MVP of the current sub-PU, and the MVs of the left and above PUs/sub-PUs. By using a mechanism similar to the PU-level search, the best MV pair for the sub-PU is selected. In some examples, the decoder uses a diamond search algorithm to search the different blocks. The final MV pair is then used as the best PU-level and sub-PU-level MV pair.
[0067] In summary, in some examples, the bilateral matching merge mode first builds the MV lists, evaluates the candidate MV pairs to obtain the starting MV pair, and then refines the pair to determine the best final MV pair.
[0068] For the template matching merge mode, the assumption is that, for the decoder to decode the current block, the decoder can use a template built from the neighboring blocks to find a best match. Thus, the decoder can use the neighboring blocks to find a best match and then use the best-match motion vector. Figure 5 shows an exemplary technique for the template matching merge mode. Referring to Figure 5, the template 502 includes the reconstructed pixels of the four rows above the current block 504 and of the four columns to the left of the current block 504, and is used to perform the matching in Ref 0 506 for the current picture 508. Therefore, unlike the bilateral matching merge mode, the template matching merge mode does not use two reference frames - it uses only one reference frame.
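The template cost can be sketched as follows: the template is the four reconstructed rows above and four columns to the left of the current block, and a candidate MV is scored by the SAD between that template and the co-located template at the displaced position in the reference picture. Frames are plain 2-D lists, and the function name is illustrative:

```python
def template_cost(cur, ref, x, y, bw, bh, mv, t=4):
    """SAD between the current block's template (t rows above, t columns to
    the left) and the template of the candidate position in the reference."""
    cost = 0
    for j in range(-t, 0):               # rows above the block
        for i in range(bw):
            cost += abs(cur[y + j][x + i] - ref[y + mv[1] + j][x + mv[0] + i])
    for j in range(bh):                  # columns to the left of the block
        for i in range(-t, 0):
            cost += abs(cur[y + j][x + i] - ref[y + mv[1] + j][x + mv[0] + i])
    return cost
```

A zero MV against an identical reference gives a cost of 0; any displacement over a non-uniform picture gives a positive cost, and the decoder keeps the MV with the lowest cost.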
[0069] Like the bilateral matching merge mode, two-stage matching is also applied for template matching. In the PU-level matching, eleven starting MVs in LIST_0 and LIST_1 are selected, respectively. These MVs include seven MVs of merge candidates and four MVs of temporal derived MVPs. Two different sets of starting MVs are generated for the two lists. For each MV in a list, the SAD cost of the template with the MV is calculated. The MV with the lowest cost is the best MV. Then, a diamond search is performed to refine the MV. The refinement accuracy is 1/8-pel. The refinement search range is restricted to within ±8 pixels. The final MV is the MV derived at the PU level. The MVs in LIST_0 and LIST_1 are generated independently.
[0070] For the second stage, the sub-PU-level search, the current PU is divided into sub-PUs. The depth of the sub-PU is signaled in the SPS (for example, 3). The minimum sub-PU size is 4x4. For each sub-PU within the left or top PU boundaries, several starting MVs in LIST_0 and LIST_1 are selected, which include the MV derived at the PU level, the zero MV, the HEVC collocated TMVP of the current sub-PU and of the bottom-right block, the temporal derived MVP of the current sub-PU, and the MVs of the left and above PUs/sub-PUs. By using a mechanism similar to the PU-level search, the best MV pair for the sub-PU is selected. A diamond search is performed to refine the MV pair. Motion compensation for the sub-PU is then performed to generate the predictor for the sub-PU. For the PUs that are not within the left or top PU boundaries, the second stage, the sub-PU-level search, is not applied, and the corresponding MVs are set equal to the MVs of the first stage.
[0071] When a bi-prediction MV pair is signaled (for example, in merge mode, when a bi-predicted merge candidate is selected), a decoder-side MV refinement (DMVR) process can be performed to refine the LIST_0 and LIST_1 MVs for better coding efficiency. An example of the DMVR process was proposed by HiSilicon in JVET-D0029, entitled Decoder-Side Motion Vector Refinement Based on Bilateral Template Matching, which is incorporated by reference in this specification in its entirety. Figure 9 shows a DMVR process 900 that uses the reference picture 0 902 and the reference picture 1 904, according to some examples. In the DMVR process 900, a bi-predicted block (the bilateral template) is generated using the bi-prediction of the reference block 906 of MV0 908 and the reference block 910 of MV1 912. The bi-predicted block is used as a new current block Cur' (in place of the original current block 914) to perform motion estimation, searching for a better matching block in the reference picture 0 902 and the reference picture 1 904, respectively. The refined MVs (MV0' and MV1', not shown in Figure 9) are used to generate the final bi-predicted prediction block for the current block.
[0072] In some modalities, the DMVR uses a two-stage search to refine the MVs of the current block to generate MV0' and MV1'. Figure 10 shows an exemplary two-stage search process 1000 to search for a new (for example, better) matching block using the bi-predicted block, according to some modalities. As shown in Figure 10, for the current MV in the reference picture 0, the cost of the current MV candidate is first evaluated at the square block 1002 (also referred to as L0_pred). For example, the cost of the block 1002 can be calculated as the sum of absolute differences (SAD) of (Cur' - L0_pred) to compute the initial cost. In the first search stage, an integer-pixel square search is performed around the block 1002. As shown in this example, eight candidates (the eight large circles 1004A-1004H in Figure 10, collectively referred to as 1004) are evaluated. The distance between two adjacent circles (for example, 1004A and 1004B), and between the square block 1002 and an adjacent circle (for example, 1004B), is one pixel. An 8-tap filter can be used to generate the candidate block for each of the blocks 1004, and the cost of each candidate can be evaluated using the SAD. The candidate of the eight candidates 1004 with the best cost (for example, the lowest cost, if the SAD is used) is selected as the best MV candidate of the first stage, shown as 1004H in this example. In the second stage, a half-pixel square search is performed around the best MV candidate of the first stage (1004H, in this example), shown as the eight small circles 1006A-1006H (collectively, the half-pixels 1006). An 8-tap filter can also be used to generate a candidate block for each of the half-pixels 1006, and the cost can be determined using the SAD. The MV candidate with the best cost (for example, the lowest cost) is selected as the final MV, which is used for the final motion compensation. The process is repeated for the reference picture 1 904 to determine its final MV. The final bi-predicted block is regenerated using the refined MVs.
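The two-stage square search can be sketched generically. The sketch below takes the cost function as a callback (in the DMVR it would be the SAD between Cur' and the interpolated candidate block; the 8-tap interpolation itself is omitted), refining first on the integer-pixel square and then on the half-pixel square around the stage-1 winner. The names are illustrative:

```python
SQUARE = [(-1, -1), (0, -1), (1, -1), (-1, 0),
          (1, 0), (-1, 1), (0, 1), (1, 1)]

def square_search(cost_fn, mv, step):
    """Evaluate the eight square neighbours of mv at the given step size and
    return the candidate with the lowest cost (mv itself if none improves)."""
    best_mv, best_cost = mv, cost_fn(mv)
    for dx, dy in SQUARE:
        cand = (mv[0] + dx * step, mv[1] + dy * step)
        cost = cost_fn(cand)
        if cost < best_cost:
            best_mv, best_cost = cand, cost
    return best_mv

def dmvr_refine(cost_fn, mv):
    stage1 = square_search(cost_fn, mv, 1.0)     # integer-pixel square
    return square_search(cost_fn, stage1, 0.5)   # half-pixel square around it
```

With a toy cost (L1 distance to an ideal displacement of (1.5, -0.5)), the search starting from (0, 0) first picks the integer candidate (1, -1) and then the half-pixel candidate (1.5, -0.5).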
[0073] Figure 6 shows an exemplary decoder architecture 600, according to some modalities. The Entropy Decoder includes, for example, a CABAC or CAVLC entropy decoder, which parses the syntax from the bitstream. The ColMV DMA 610 stores the collocated temporal MVs. The MV Dispatcher 612 reconstructs the MVs of the blocks and issues the memory fetch instructions to the MC cache 614 and the DRAM (not shown) through the memory interface arbiter 616. The Inverse Transform 618 includes the inverse quantization and inverse transform (IQIT) that generate the reconstructed residual 620. The Prediction block 622 generates the inter motion-compensated and intra predictors. The Deblock block 624 is designed to reduce blocking artifacts, and the Rec DMA 626 stores the reconstructed pixels to the external DRAM. Additional details of exemplary components of this architecture are discussed in C.-T. Huang et al., A 249MPixel/s HEVC video-decoder chip for Quad Full HD applications, IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, pp. 162-163, February 2013, which is incorporated by reference in this specification in its entirety. Notably, the architecture is divided into four pipeline stages: the EC stage 602, the IQIT (inverse quantization and inverse transform)/fetch stage 604, the reconstruction stage 606 and the loop filter stage 608. In HEVC and H.264, the final MV can be derived either in the EC stage 602 (which includes parsing) or in the reconstruction stage 606. In some implementations, the decoder derives the final MV in the parsing stage and prefetches the required reference pixel data also in the parsing stage (the EC stage 602). This can be done, for example, to reduce/hide the DRAM access time.
[0074] Figure 7 shows an example of a decoder pipeline execution 700 when executing a decoder architecture, such as the decoder architecture shown in Figure 6, according to some modalities. Figure 7 includes the parsing stage 702, during which the motion vectors are reconstructed as described above. The IQ/IT stage 704-1 generates the reconstructed residual for the current block. The Reference Pixel Fetch stage 704-2 fetches the reference pixel data from memory. Reference frames are often stored in external memory, such as a DRAM. Thus, if the decoder wants to perform motion compensation on a reference frame, the decoder must first go to the external memory to retrieve the reference data. Typically, considerable latency is required to obtain data from the external memory. The Intra/MC (Motion Compensation) Reconstruction stage 706 performs the prediction. The Deblocking (DB)/Sample Adaptive Offset (SAO) stage 708 performs the loop filtering process to improve the quality of the decoded frame.
[0075] Generally, the decoder decodes CU0 first, then CU1, and so on. To give an example using CU0: at t0, the decoder decodes CU0 in the parsing stage 702, including the reconstruction of the MVs. Then, at t1, CU0 moves to the IQ/IT stage 704-1. In order to perform motion compensation in the Intra/MC Reconstruction stage 706, the decoder needs to do a prefetch in the preceding stage (the Reference Pixel Fetch stage 704-2).
[0076] As can be seen in Figure 7, in order to hide the delay of fetching data from memory (for example, so that it does not stall the pipeline execution), since the decoder needs to know the motion vector before the reconstruction performed in the Intra/MC Reconstruction stage 706, the data are prefetched in the Reference Pixel Fetch stage 704-2 and stored in a local memory (for example, an SRAM or cache memory). For example, in MPEG-2/4, H.264/AVC and HEVC video decoders, the MVs can be reconstructed in the parsing stage. According to the reconstructed MVs, the required reference pixels can be fetched from the DRAM and stored in the local memory, for example, an SRAM or cache memory. In the Intra/MC Reconstruction stage 706, the reference data can then be loaded from the local memory without latency cycles.
[0077] However, the decoder-side predictor refinement tools use the neighboring block(s) to derive the motion vector (for example, in PMVD, the template matching merge mode uses the neighboring blocks to derive the motion vector). The template block, though, is not generated until the third stage (the Intra/MC Reconstruction stage 706). For example, when the PMVD is applied, the final MVs of a PMVD-coded block depend on the PMVD search process in the Intra/MC Reconstruction stage 706, which means that the MVs cannot be reconstructed in the parsing stage 702 and, therefore, the data prefetch is not feasible in the Reference Pixel Fetch stage 704-2.
[0078] Figure 8 shows an example of a decoder pipeline execution when executing a decoder-side predictor refinement tool. For example, and using the PMVD as an example, at time t2, since the MVs for CU0 depend on the PMVD search process in the Intra/MC Reconstruction stage 706 (which is also performed at t2), the MVs cannot be reconstructed in the parsing stage 702 for CU1 (at time t1), and thus the data cannot be prefetched for CU1 at t2 in the Reference Pixel Fetch stage 704-2. This problem similarly affects the processing of each CU and, therefore, ultimately results in only one CU finishing processing every two time slots. For example, Figure 8 shows that over t4 and t5 the decoder completes the processing of only CU1, compared to Figure 7, where CU1 completes processing at t4 and CU2 completes processing at t5.
[0079] The data prefetching problems can be addressed when decoder-side predictor refinement techniques (for example, PMVD) are used for decoding. For example, the techniques allow the data to be prefetched in a way that still hides the latency cycles, as shown in Figure 7, instead of causing a delay as shown in Figure 8. To facilitate the illustration, in the discussion below the PMVD is used as an example, although a person skilled in the art can appreciate that the techniques can be adapted for other decoder-side predictor refinement techniques (for example, BIO and DMVR).
[0080] According to some modalities, the original candidate MV is preserved in the MV buffer for the next decoding process. In some examples, the selected merge candidate MVs (for example, the starting, or unrefined, MVs) are stored back in the MV buffers so that the decoder can reference them for the neighboring blocks and the collocated blocks/pictures. Therefore, according to some examples, the MC of the PMVD block (for example, performed in the Intra/MC Reconstruction stage 706) uses the PMVD-derived MVs, but the selected merge candidate MVs are stored back in the MV buffers for future reference. This can allow, for example, the MVs to be reconstructed in the parsing stage 702, and the reference pixels to be prefetched in the stage 704-2. If the current block is a PMVD-coded block, a larger reference block (for example, including the refinement search range) can be prefetched. Therefore, in some examples, the MV stored for the current block is not refined, but the decoder uses the refined MV for the compensation.
[0081] In some examples, the decoder can be configured not to change the MV in the MV buffer. For example, the decoder can store the starting point (for example, the initial MV(s)) in the MV buffer and perform the refinement to generate a refined MV that is used only to generate the motion compensation data, without changing the MV in the MV buffer. The MV buffers for future reference (for example, for the merge candidate list and AMVP candidate list generation) are not changed.
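The buffer behavior described above can be sketched as follows: the unrefined starting MV is what neighboring blocks later read from the MV buffer, while the refined MV only drives the motion compensation. The class and names are illustrative, not part of any standard:

```python
class MvBuffer:
    """Keeps the unrefined starting MVs for future reference (merge/AMVP
    list generation), while refined MVs are used only for the MC."""
    def __init__(self):
        self.stored = {}   # block id -> MV visible to neighbouring blocks

    def decode_block(self, block_id, start_mv, refine_fn):
        refined_mv = refine_fn(start_mv)     # decoder-side refinement
        self.stored[block_id] = start_mv     # the buffer keeps the start MV
        return refined_mv                    # the refined MV drives MC only
```

For example, after `buf.decode_block("cu0", (2, 3), refine)`, a neighbor parsing later still sees (2, 3) in the buffer even though the MC used the refined vector.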
[0082] In some examples, the decoder can use a separate buffer for the refinement. For example, the decoder can retrieve the initial MV, perform the PMVD search and perform the refinement without storing the refined MV in the original MV buffer - the decoder stores the refined MV in a temporal buffer.
[0083] In some examples, an initial candidate can be signaled for the PMVD. For example, an initial candidate index can be signaled, which is used to select an initial MV from an MV candidate list. This can be done, for example, so that the decoder knows which candidate of the eleven candidates is to be used as the initial candidate for the PMVD. The decoder can first generate the eleven candidates, and the encoder can signal to the decoder which one is the best. This signaling can allow, for example, the decoder to skip the template matching and proceed directly to the refinement, since the decoder knows the initial candidate (for example, the decoder can perform the refinement using the template matching and the diamond search technique to refine the MV around the initial candidate). Although the MV is refined by the diamond search, the proposed method stores only the initial candidate, not the refined motion vector.
[0084] In some examples, for the PMVD (for example, including the bilateral matching merge mode and the template matching merge mode), the LIST_0 and LIST_1 MVs of the merge candidates are used as starting MVs. In some examples, a best MV candidate can be implicitly derived by searching all these MVs, which can require a lot of memory bandwidth. In this example, the merge index for the bilateral matching merge mode or the template matching merge mode is signaled. The signaled merge index can indicate, for example, the best starting MVs in LIST_0 and LIST_1 in the template matching merge mode, and the two best MV pairs (one derived from LIST_0 and the other derived from LIST_1) in the bilateral matching merge mode. By signaling the merge index, the template matching step can be limited to, for example, a refinement search around the signaled merge candidate. For the bilateral matching, the decoder can perform a cost evaluation to select the best MV pair from the two MV pairs and then perform the refinement search. For the bilateral matching, if the merge candidate is a unidirectional MV, its corresponding MV in the other list can be generated by using the mirrored (scaled) MV. In some modalities, by using a predefined MV generation method, the starting MVs in LIST_0, LIST_1, and/or the MV pairs are known. The best starting MVs in LIST_0 and/or LIST_1, or the best MV pair, are explicitly signaled to reduce the bandwidth requirement.
[0085] In some examples, when a merge index is signaled, the decoder can further use the selected MV to exclude or select some candidates in the first stage (the PU-level matching). For example, the decoder can exclude some MVs in the candidate list that are far from the selected MVs. As another example, the decoder can pick the N MVs in the candidate list that are closest to the selected MV but in different reference frames.
[0086] As explained in this specification, some techniques provide for signaling the initial MV (for example, signaling the initial candidate, as described above for the PMVD) by generating an initial MV candidate list and signaling a candidate index. Using the PMVD as an example, since the PMVD performs MV refinement, two similar initial MV candidates can lead to the same refined final MV. Thus, similar MVs in the candidate list generation can be removed from the candidate list, or pruned, since they can have the same refined final MV, given that the PMVD searches for a local minimum around the initial candidate.
[0087] A motion vector candidate list can be pruned and/or created using the techniques described in this specification. Figure 16A shows an exemplary method 1600 for pruning a motion vector candidate list, according to some modalities. Figure 16B shows an exemplary method 1650 for creating a motion vector candidate list, according to some modalities. For example, the list can start empty and, whenever a new candidate is to be added, the techniques can determine whether or not the new candidate is redundant with any existing motion vector candidate in the list. If it is redundant, the new candidate is not added.
[0088] Referring to Figure 16A, in step 1602, the decoder stores an initial motion vector candidate list. For example, the traditional merge candidate list generation process (for example, described above) can be used to generate the PMVD merge candidate list. Referring to steps 1604-1610 in Figure 16A, for the MV derivation, a newly added MV can be compared with the MVs that are already in the candidate list. If one (or more) of the MVs is similar to the newly added MV, the newly added MV is removed from the list. In particular, in step 1604, the decoder compares the new candidate with an existing candidate in the MV candidate list to determine a similarity of the candidates. In step 1606, the decoder compares the similarity with a predetermined threshold. If the similarity is not below the predetermined threshold, the decoder removes the candidate in step 1608 (and proceeds to step 1610). Otherwise, if the similarity is below the predetermined threshold, the method proceeds to step 1610. In step 1610, if there are more candidates in the MV candidate list to be checked, the method 1600 returns to step 1604 for each remaining candidate in the candidate list. Otherwise, if all the MV candidates in the MV candidate list have been compared with the new candidate (and each comparison was below the threshold in step 1606), in step 1612 the decoder keeps the new MV candidate in the MV candidate list. In step 1614, the method 1600 takes the first N candidates of the initial motion vector candidate list. The value of N can be a predetermined value; N can be used to ensure that the final list size is below a predetermined maximum size. In some examples, if the initial motion vector candidate list has fewer than N candidates, step 1614 does not modify the initial motion vector candidate list. The method 1600 proceeds to step 1616 and ends.
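A minimal sketch of the pruning in method 1600: each candidate is kept only if it is not similar to any candidate already kept, and the first N survivors form the final list. The similarity test is supplied as a callback (for example, one of the threshold measures discussed for Equations 1 and 2); the names are illustrative:

```python
def prune_candidate_list(candidates, is_similar, n):
    """Drop each candidate that is similar to one already kept, then take
    the first n survivors (method 1600, steps 1604-1614)."""
    kept = []
    for cand in candidates:
        if not any(is_similar(cand, existing) for existing in kept):
            kept.append(cand)
    return kept[:n]

# Example similarity: sum of absolute component distances below one pixel
# threshold of 2 (an Equation 1 style check with K = 2).
similar = lambda a, b: abs(a[0] - b[0]) + abs(a[1] - b[1]) < 2
```

For example, `prune_candidate_list([(0, 0), (4, 4), (4, 5), (0, 0)], similar, 5)` returns `[(0, 0), (4, 4)]`: the duplicate (0, 0) and the near-duplicate (4, 5) are pruned.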
[0089] Referring to Figure 16B, the method 1650 includes some steps similar to the method 1600 of Figure 16A, including the steps 1602, 1604, 1606, 1610 and 1616, as discussed further below. In step 1602, the decoder stores an initial motion vector candidate list. For example, the initial motion vector candidate list can be empty. In step 1652, the decoder generates a new motion vector candidate. In step 1604, the decoder compares the new candidate with an existing candidate in the initial MV candidate list to determine a similarity of the candidates. In some examples, if there is not yet any candidate in the initial MV candidate list, although not shown, the method 1650 can proceed directly to step 1654 and add the candidate to the initial MV candidate list. In step 1606, the decoder compares the similarity with a predetermined threshold. If the similarity is not below the predetermined threshold, the decoder does not add the new MV to the list and proceeds to step 1610. If the similarity is below the predetermined threshold, the method 1650 proceeds to step 1654 and adds the candidate to the list. From step 1654, the method 1650 proceeds to step 1656 and determines whether the size of the list is equal to a predetermined size. If not, the method proceeds to step 1610; otherwise, the method proceeds to step 1616 and ends. In step 1610, if there are more candidates to check, the method 1650 returns to step 1604 for each remaining candidate. Otherwise, the method 1650 proceeds to step 1616 and ends.
[0090] In some modalities, the similarity of the MVs can be determined based on whether (a) the reference frame indices (or POCs) are the same, and/or (b) the MV difference is less than a threshold. For example, the sum of the absolute MV distances of MVx and MVy can be calculated using Equation 1:
Equation 1: abs(MVx0 - MVx1) + abs(MVy0 - MVy1) < K;
where K is a pixel distance, for example, half a pixel, one pixel, two pixels, three pixels, three and a half pixels, etc.
[0091] As another example, the absolute MV distance of MVx and the absolute MV distance of MVy can each be compared with K, using Equation 2 below:
Equation 2: abs(MVx0 - MVx1) < K && abs(MVy0 - MVy1) < K;
where K, as in Equation 1, can be half a pixel, one pixel, two pixels, three pixels, three and a half pixels, etc.
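Equations 1 and 2 can be expressed directly in code; the sketch below also folds in the reference-index check from item (a), with K given in the same units as the MV components. The function name is illustrative:

```python
def mv_similar(mv0, mv1, ref_idx0, ref_idx1, k, use_sum=True):
    """Return True if two MVs are considered similar: the reference indices
    must match, and the component distances must fall below the threshold K
    (Equation 1 when use_sum is True, Equation 2 otherwise)."""
    if ref_idx0 != ref_idx1:
        return False
    dx = abs(mv0[0] - mv1[0])
    dy = abs(mv0[1] - mv1[1])
    if use_sum:
        return dx + dy < k        # Equation 1
    return dx < k and dy < k      # Equation 2
```

Note that the two equations disagree on some pairs: with K = 3, the MVs (2, 2) and (0, 0) fail Equation 1 (sum distance 4) but pass Equation 2 (each component distance is 2).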
[0092] In some modalities, for example, for the bilateral matching merge mode, the MV candidate pair can be checked to determine whether the MVs are on the same motion trajectory. For example, the original merge candidate MVs can be checked to determine whether the MVs in LIST_0 and LIST_1 are on the same motion trajectory. Figure 14 shows an example of checking whether the MV candidate pairs are on the same motion trajectory, according to some modalities. If the MVs in LIST_0 and LIST_1 are on the same motion trajectory, as shown in 1402, the MV candidate is added to the list; otherwise, if the MVs in LIST_0 and LIST_1 are not on the same motion trajectory, as shown in 1404, the MVs in LIST_0 and LIST_1 are separated into two MV candidates. For each of the two separated MV candidates, the MV of the missing list is filled with the mirrored MV of the other list, as shown by 1406 and 1408. As another example, each bi-prediction MV candidate is separated into two candidates: one candidate is the LIST_0 MV and the other is the LIST_1 MV. Then, each candidate (for example, each uni-prediction candidate) is used to generate the MV of the missing list by filling the missing-list MV with the mirrored MV of the valid list.
[0093] In the PMVD MV search, an MV search method can be predefined (for example, a three-step diamond search). For example, for a diamond search, the step size of the first-step diamond search is half a pixel (half-pel). The step size of the second-step cross search is a quarter of a pixel (quarter-pel). The step size of the third-step cross search is 1/8 of a pixel (1/8-pel). In some modalities, both (a) the merge index of the initial MV and (b) a coarse-grained MVD are signaled. The MVD can be the refinement position index of the first-step diamond search and/or a conventional MVD. The MVD unit can be 1/16-pel, 1/8-pel, quarter-pel, half-pel, one pixel, two pixels, or any predefined unit. The MVs of the selected merge index plus the signaled MVD (or the refinement position MV) can be used as the PMVD starting MV, which is stored in the MV buffer for the merge candidate and AMVP candidate derivation referencing. In some examples, for the encoder and/or the decoder, the PMVD search can start from the PMVD starting MV. The final PMVD-derived MV is used only for the MC. The starting MVs of the PMVD-coded block can be reconstructed in the parsing stage.
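The predefined three-step search (half-pel diamond, then quarter-pel and 1/8-pel cross) can be sketched generically, again with the matching cost supplied as a callback. The patterns and names below are a plausible illustration under those step sizes, not the exact patterns of any codec:

```python
DIAMOND = [(0, -2), (1, -1), (2, 0), (1, 1),
           (0, 2), (-1, 1), (-2, 0), (-1, -1)]
CROSS = [(0, -1), (1, 0), (0, 1), (-1, 0)]

def pattern_search(cost_fn, mv, pattern, step):
    """Greedily step to any improving neighbour in the pattern, scaled by
    the step size, until no neighbour improves the cost."""
    best, best_cost = mv, cost_fn(mv)
    improved = True
    while improved:
        improved = False
        for dx, dy in pattern:
            cand = (best[0] + dx * step, best[1] + dy * step)
            cost = cost_fn(cand)
            if cost < best_cost:
                best, best_cost, improved = cand, cost, True
    return best

def three_step_refine(cost_fn, start_mv):
    mv = pattern_search(cost_fn, start_mv, DIAMOND, 0.5)    # half-pel diamond
    mv = pattern_search(cost_fn, mv, CROSS, 0.25)           # quarter-pel cross
    return pattern_search(cost_fn, mv, CROSS, 0.125)        # 1/8-pel cross
```

With a toy L1 cost toward an ideal displacement of (0.625, -0.25), the three steps land on (0.5, -0.5), then (0.5, -0.25), then (0.625, -0.25), illustrating the coarse-to-fine progression.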
[0094] In some examples, only one MVD, and/or only one MVD refinement position index, is signaled. If the merge candidate is a bi-predicted candidate, the MVD is added only to LIST_0 or LIST_1. For the bilateral matching merge mode, if the MVD is added to LIST_0, the LIST_1 starting MV can be the mirrored MV of the LIST_0 starting MV.
[0095] In some examples, the coarse-grained MVD is not coded but is derived in the search process at the decoder. For example, the search process can be divided into three stages: the first-step diamond search, the second-step cross search and the third-step cross search. The coarse-grained MVD can be the result of the search process in the first-step diamond search or in the second-step cross search.
[0096] In HEVC, a picture is divided into coding tree units (CTUs), which are the basic processing units of HEVC. The CTUs are coded in raster scan order. In a pipelined decoding architecture, most of the information of the upper CTU rows is available in the parsing stage (for example, including the MV information), since those rows have already been processed. In some examples, the decoder-side derived MVs in the CTUs of the upper CTU row can be referenced (or used), for example, for the merge candidate list and AMVP list generation, since that information is available in the parsing stage. The decoder can use the MVs derived in these CTUs, although the decoder-side derived MVs in the current CTU row cannot be used, since they are not available.
[0097] Therefore, in some modalities, a CTU-row constraint can be used with the techniques described in this specification, so that the PMVD-derived MVs in the upper CTU row can be referenced (for example, when not referring to the MV of the PMVD-coded block) or used (for example, when storing the merge candidate MVs; when storing the merge candidate MVs and the mirrored MVs; when sending the merge index for the PMVD and the bilateral mirrored MVs (and evaluating only one MV); when signaling the merge index and the coarse-grained MVD; and/or in the AMVP and PMVD modes).
[0098] For example, consider the techniques discussed in this specification for storing the merge candidate MVs, storing the merge candidate MVs and the mirrored MVs, and sending the merge index for the PMVD and the bilateral mirrored MVs (and evaluating only one MV). When referring to the PMVD-coded blocks in the current CTU row, the selected merge candidate MVs can be used for the merge candidate derivation and the AMVP candidate derivation. When referring to the PMVD-coded blocks in the upper CTU row, the final PMVD-derived MVs can be used.
[0099] As another example, consider the techniques discussed in this specification for not referring to the MV of the PMVD-coded block. When referring to the PMVD-coded blocks in the current CTU row, the MVs are not available for the merge candidate derivation and the AMVP candidate derivation. When referring to the PMVD-coded blocks in the upper CTU row, the final PMVD-derived MVs are used.
[00100] The CTU-row constraint can be changed to a CTU constraint or to any predefined or derived area constraint. For example, when not referring to the MV of the PMVD-coded block, if the CTU constraint is applied, the MVs of the PMVD-coded blocks in the current CTU are not available, while the MVs of the PMVD-coded blocks in different CTUs are available.
[00101] Overlapped block motion compensation (OBMC) is a coding tool that can be used to reduce block artifacts in motion compensation. An example of how OBMC is performed at block boundaries is described in JVET-F1001, entitled Algorithm Description of Joint Exploration Test Model 6 (JEM 6), which is incorporated by reference into this specification in its entirety. To facilitate the illustration, the description that follows refers to JVET-F1001, but this description is not intended to be limiting.
[00102] For OBMC, in some examples, the neighboring block is compensated with the MV of the current block. As shown in Figure 11, which is a summary of Figure 14 in section 2.3.4 of JVET-F1001, OBMC is performed at the sub-block level for all motion compensation (MC) block boundaries, where the sub-block size is equal to 4 × 4. JVET-F1001 explains that when OBMC is applied to the current sub-block, in addition to the current motion vector, the motion vectors of the four connected neighboring sub-blocks, if available and not identical to the current motion vector, are also used to derive the prediction block for the current sub-block. These multiple prediction blocks, based on multiple motion vectors, are combined to generate the final prediction signal for the current sub-block.
[00103] JVET-F1001 further explains that the prediction block based on the motion vectors of a neighboring sub-block is denoted PN, with N indicating an index for the above, below, left and right neighboring sub-blocks, and the prediction block based on the motion vectors of the current sub-block is denoted PC. When PN is based on the motion information of a neighboring sub-block that contains the same motion information as the current sub-block, OBMC is not performed from PN. Otherwise, every sample of PN is added to the same sample in PC, that is, four rows/columns of PN are added to PC. The weighting factors {1/4, 1/8, 1/16, 1/32} are used for PN and the weighting factors {3/4, 7/8, 15/16, 31/32} are used for PC. The exceptions are small MC blocks (when the height or width of the coding block is equal to 4, or a CU is coded with the sub-CU mode), for which only two rows/columns of PN are added to PC. In this case, the weighting factors {1/4, 1/8} are used for PN and the weighting factors {3/4, 7/8} are used for PC. For a PN generated based on the motion vectors of a vertically (horizontally) neighboring sub-block, samples in the same row (column) of PN are added to PC with the same weighting factor. Figure 12A shows an exemplary high-level summary of the OBMC method 1200. MVA 1202 represents the original MV. Using a decoder-side predictor refinement technique, MVA 1202 is refined to MVA' 1204. MVA' 1204 is used for OBMC at the block boundaries, resulting in the blended sections 1206 and 1208 based on MVA' 1204.
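As a worked illustration of the weighting described above, the sketch below blends the four boundary rows of a 4 × 4 prediction PN (from a vertically neighboring sub-block) into PC with the factors {1/4, 1/8, 1/16, 1/32}. Floating-point arithmetic is used for clarity, whereas a real codec would use integer arithmetic, and the small-block two-row exception is omitted.

```python
# JVET-F1001 OBMC row blending: each of the four rows of Pc nearest the
# boundary is blended with the corresponding row of Pn, and all samples in
# one row share the same weighting factor.

PN_WEIGHTS = [1 / 4, 1 / 8, 1 / 16, 1 / 32]  # per-row weights for Pn


def obmc_blend_rows(pc, pn):
    """Blend Pn into Pc row by row; pc and pn are 4x4 lists of samples."""
    out = [row[:] for row in pc]
    for r, w in enumerate(PN_WEIGHTS):
        # Weight (1 - w) for Pc corresponds to {3/4, 7/8, 15/16, 31/32}.
        out[r] = [(1 - w) * c + w * n for c, n in zip(pc[r], pn[r])]
    return out
```

For a horizontally neighboring sub-block the same factors would be applied per column instead of per row.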
[00104] As described in this specification, techniques are provided to allow execution timing of decoder-side predictor refinement techniques similar to that of traditional decoding methods. For example, some embodiments include using the initial MV (not the refined MV) or the partially refined MV (the initial MV plus the signaled MV offset) to reference the neighboring block in the parsing stage and the prefetch stage (for example, stages 602 and 604 in Figure 6). In some embodiments, such techniques may result in the use of the initial MV for other processing, such as deblocking, OBMC, and temporal colocated MV derivation. Using the initial MV for such other processing can introduce blocking artifacts. For example, some blocking artifacts can appear when the OBMC and/or the deblocking uses a restored MV, so that the OBMC or the deblocking is not performed using refined MVs. Figure 12B shows an exemplary result 1250 of applying OBMC using the initial MV 1202 (for example, a restored MV). Unlike Figure 12A, with the blended sections 1206 and 1208 based on MVA' 1204, the blended sections 1252 and 1254 in Figure 12B are based on MVA 1202. This can cause, for example, a blocking artifact, since the MV of the neighboring block is MVA' 1204, but the MV used for blending is MVA 1202.
[00105] To solve these post-parsing processing problems, multiple MVs can be used. Figure 15 shows an exemplary decoding method 1500 for decoding video data that uses two MVs, according to some embodiments. In step 1502, the decoder receives compressed video data related to a set of frames. In steps 1504-1510, the decoder calculates, using a decoder-side predictor refinement technique, a new motion vector for a frame of the set of frames. In particular, in step 1504, the decoder retrieves (for example, from a first buffer) a first motion vector (for example, an unrefined MV) associated with the current frame. In step 1506, the decoder performs a first portion of the decoding process (for example, the parsing stage, an MV/MVP derivation and/or an MV refinement technique) using the first motion vector. In step 1508, the decoder retrieves a second motion vector (for example, from a second buffer) associated with the current frame (for example, a refined MV). In step 1510, the decoder performs a second portion of the decoding process (for example, the reconstruction stage, the motion compensation portion, a deblocking portion and/or OBMC) using the second motion vector.
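Steps 1502-1510 can be sketched schematically as follows. This is not a real codec: the buffer layout (plain dictionaries) and the `refine()` callback standing in for a decoder-side refinement tool (for example, PMVD, DMVR or BIO) are assumptions for illustration.

```python
# Two-buffer flow: the parsing portion reads the unrefined MV from one
# buffer, refinement writes the refined MV to a second buffer, and the
# reconstruction portion reads the refined MV.

def decode_block(block_id, unrefined_mvs, refined_mvs, refine):
    # Step 1504: retrieve the first (unrefined) MV for the current block.
    mv0 = unrefined_mvs[block_id]
    # Step 1506: first portion of decoding (parsing / MV derivation) uses mv0.
    result = {"block": block_id, "mv_for_parsing": mv0}
    # Decoder-side refinement produces the second MV (stored separately).
    refined_mvs[block_id] = refine(mv0)
    # Step 1508: retrieve the second (refined) MV.
    mv1 = refined_mvs[block_id]
    # Step 1510: second portion (reconstruction / MC / OBMC / deblocking).
    result["mv_for_reconstruction"] = mv1
    return result
```

Because the parsing portion never reads `refined_mvs` for the current row, it does not have to wait for refinement to finish, which is the timing property the paragraph above describes.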
[00106] With reference to steps 1504-1510, in some embodiments, two sets of MVs may be used: (1) a first set of MVs used for the parsing stage (for example, parsing stage 702 in Figure 7), including for MV/MVP derivation and/or pixel prefetching, and (2) a second set of MVs for reconstruction (for example, during the Intra/MC Rec. stage 706 in Figure 7), including for motion compensation, OBMC and/or deblocking. The first set of MVs can store the original (unrefined) MVs, and the second set of MVs can store the refined MVs. Such techniques can allow, for example, the OBMC and/or the deblocking to use the modified MV. Using the modified MV can avoid additional blocking artifacts (for example, those that can be caused by performing OBMC and/or deblocking using unrefined MVs) and/or can provide a better coding gain compared to using unrefined MVs.
[00107] For example, to deal with potential blocking artifacts, a separate set of unrefined MVs can be used in the parsing stage (for example, for merge candidate list generation and/or AMVP candidate list generation). According to some examples, the MVs in the unrefined MV set are not refined by a decoder-side MV refinement tool, and can be used for MV parsing and MV reconstruction. Then, the reconstructed MVs are used to fetch the reference pixels. The MVs refined by a decoder-side MV refinement tool can be stored in another set of MV buffers. The refined MVs can be used for motion compensation, OBMC, deblocking and/or other tools that do not alter the parsing process according to the MVs.
[00108] Since the MVs in other, previously refined images are already refined, using the refined MVs of those other images will not introduce the prefetch problem described above in conjunction with Figure 8. In some embodiments, the refined MV set can be used for temporal MV derivation in the parsing stage and in the MV reconstruction stage. For example, for merge candidate list generation and AMVP candidate list generation, when deriving spatial neighboring MVs, the unrefined MV set is used, while when deriving temporal colocated MVs, the refined MV set is used.
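A minimal sketch of this selection rule, with hypothetical dictionary-based MV sets: spatial neighbor candidates read the unrefined set, while temporal colocated candidates read the refined set (those images were fully refined before the current one was parsed).

```python
# Candidate lookup: choose the MV buffer according to the candidate type.

def candidate_mv(neighbor, unrefined_set, refined_set):
    """neighbor is ('spatial', key) or ('temporal', key)."""
    kind, key = neighbor
    if kind == "temporal":
        # Colocated image: refinement finished, refined MV is safe to use.
        return refined_set[key]
    # Spatial neighbor in the current image: only the unrefined MV is
    # guaranteed to exist at parse time.
    return unrefined_set[key]
```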
[00109] The MVs in the upper CTU rows can now be refined, as discussed above. In some embodiments, the first set of MVs (for example, used for the parsing stage) can store the MVs of the second set of MVs (for example, used for reconstruction) if the MV is in the upper CTU row. For example, if the MV is in the upper CTU row, the parsing stage can access the second set of MVs for the upper CTU row. This can, for example, reduce the size of the unrefined MV buffer. For example, the buffer size can be reduced to only what is needed to maintain the MVs of one CTU row of blocks and one CTU column of blocks. The MVs that will not be referenced by spatially neighboring blocks in the current CTU row in the parsing stage and the MV reconstruction stage (for example, for merge candidate list generation and AMVP candidate list generation) can be discarded. Thus, in some embodiments, only the refined MVs need to be stored. In a hardware implementation, the unrefined MVs can be stored only in the parsing pipeline stage and in the prefetch pipeline stage (for example, stages 702 and 704-2 in Figure 7). In some embodiments, the techniques can use the refined MVs of the CUs that were processed before the previous N CUs. For example, if we find that the CUs before the last 5 decoded CUs are ready to be used (for example, without introducing prefetch problems), the MVs in the CUs before the last 5 decoded CUs can use the refined MV. In some embodiments, the same concept can be applied to the tile/slice boundary. For example, if the reference MV is in a different tile or slice, the parsing stage can access the second set of MVs for the MV in the different tile or in the different slice.
[00110] Regarding the first set of MVs used for the parsing stage, the first set of MVs (the unrefined MVs) can be used for merge/AMVP candidate generation and/or initial MV generation. The generated MV is used to fetch the reference pixels. In some embodiments, if the CTU-row constraint is not applied, the MV set contains (a) the unrefined MVs of the current image (for example, the left column, the above row, and the current CTU), and (b) the refined MVs of the other image (for example, the temporal colocated image). For example, referring to Figure 13, the MV set contains the unrefined MVs of the current image for the current CTU 1302, the left column 1304 and the above row 1306. In some embodiments, if the CTU-row constraint is applied, the MV set contains (a) the unrefined MVs of the current CTU row (the left column and the current CTU), (b) the refined MVs of the upper CTU row (the above row) and (c) the refined MVs of the other image. For example, referring to Figure 13, the MV set contains the unrefined MVs of the current CTU row for the current CTU 1302 and the left column 1304, and the refined MVs of the upper CTU row for the above row 1306.
[00111] Regarding the second set of MVs used for the reconstruction stage, the second set of MVs can be used for motion compensation, OBMC and/or deblocking. The second set of MVs contains (a) the refined MVs of the current image, and (b) the refined MVs of the other image. For example, referring to Figure 13, the MV set contains the refined MVs of the current image for the current CTU 1302, the left column 1304 and the above row 1306.
[00112] The proposed method of multiple MVs/sets of MVs can also be applied in the encoder. For example, a separate set of unrefined MVs can be used in the syntax encoding stage, the MV derivation and/or the MVP derivation (for example, merge candidate list generation and/or AMVP candidate list generation). According to some examples, the MVs in the unrefined MV set are not refined by a decoder-side MV refinement tool, and can be used for MV encoding and MVP generation. The MVs refined by a decoder-side MV refinement tool can be stored in another set of MV buffers. The refined MVs can be used for motion compensation, OBMC, deblocking and/or other tools that do not alter the parsing process according to the MVs.
[00113] Again, to recap, the decoder-side MV refinement tools (for example, PMVD, DMVR and BIO) can change the MV of a block (which, for example, can result in a parsing or reference-pixel prefetch problem, as discussed above). In some embodiments, when storing the refined MV back, the difference between the refined MV and the initial MV can be restricted to a predefined threshold. For example, if the difference between the refined MV and the initial MV is greater than the predefined threshold (for example, a distance of 4, 8 or 16 integer pixels), then the refined MV is first clipped (for example, set just below, or equal to, the threshold) and then stored as the clipped MV. For example, the MV can be clipped to the initial MV ± 4, 8 or 16 integer pixels. If the difference between the refined MV and the initial MV is less than this threshold, the refined MV can be stored directly.
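The clipping rule above can be sketched as follows, assuming component-wise clamping of a (horizontal, vertical) MV; the default threshold of 16 integer pixels is one of the example values mentioned above, not a mandated constant.

```python
# Clip the refined MV so that each component stays within `threshold`
# integer pixels of the initial MV; an MV already within the threshold is
# stored unchanged.

def clip_refined_mv(mv_initial, mv_refined, threshold=16):
    def clamp(init, ref):
        return max(init - threshold, min(init + threshold, ref))

    return tuple(clamp(i, r) for i, r in zip(mv_initial, mv_refined))
```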
[00114] The impact of a decoder-side MV refinement tool changing the MV of a block can be reduced by removing the pruning process between these refined MVs and other MVs in the MV/MVP derivation (for example, in the merge candidate list reconstruction or the AMVP list reconstruction). For example, in some embodiments, the pruning process used to remove redundancy between possible candidates is applied only to MVs that are not refined in the decoder. For candidates that can be refined in the decoder, the refined MVs can be added directly to the candidate list without applying the pruning process. In some embodiments, eliminating this pruning can be combined with the other techniques described above (for example, clipping the refined MV, and multiple MVs/MV sets) to further reduce the impact.
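A minimal sketch of this modified pruning, using a hypothetical (mv, is_refinable) candidate representation: candidates subject to decoder-side refinement are appended without pruning, while the remaining candidates are pruned against the list as usual.

```python
# Candidate list construction with pruning skipped for decoder-refinable
# candidates, so that list construction does not depend on refined values.

def build_candidate_list(candidates):
    """candidates is a list of (mv, is_refinable) pairs."""
    out = []
    for mv, is_refinable in candidates:
        if is_refinable:
            out.append(mv)          # added directly, no pruning
        elif mv not in out:         # prune duplicates of non-refinable MVs
            out.append(mv)
    return out
```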
[00115] In some embodiments, OBMC is applied in the reconstruction stage (for example, stage 606 in Figure 6). Therefore, two different techniques can be used for the OBMC (alone or in combination, such as using different techniques for sub-blocks along different edges). The first technique is to use the initial MV or the partially refined MV (for example, the unrefined MV) stored in the MV buffer for the OBMC. The second technique is to use the decoder-side refined MV (for example, the refined MV) for the OBMC.
[00116] The techniques that operate according to the principles described in this specification can be implemented in any suitable manner. The processing and decision blocks of the flowcharts above represent steps and acts that can be included in the algorithms that carry out these various processes. The algorithms derived from these processes can be implemented as software integrated with and directing the operation of one or more single-purpose or multi-purpose processors, can be implemented as functionally equivalent circuits, such as a Digital Signal Processing (DSP) circuit or an Application-Specific Integrated Circuit (ASIC), or can be implemented in any other suitable manner. It should be appreciated that the flowcharts included in this specification do not describe the syntax or operation of any particular circuit or of any particular programming language or type of programming language. Rather, the flowcharts illustrate the functional information that a person skilled in the art can use to manufacture circuits or to implement computer software algorithms to perform the processing of a particular device that carries out the types of techniques described in this specification. It should also be appreciated that, unless otherwise indicated in this specification, the particular sequence of steps and/or acts described in each flowchart is merely illustrative of the algorithms that can be implemented, and can be varied in implementations and embodiments of the principles described in this specification.
[00117] Therefore, in some embodiments, the techniques described in this specification can be embodied in computer-executable instructions implemented as software, including as application software, system software, firmware, middleware, embedded code or any other suitable type of computer code. Such computer-executable instructions can be written using any of various suitable programming languages and/or programming or scripting tools, and can also be compiled as executable machine language code or intermediate code that is executed on a framework or virtual machine.
[00118] When the techniques described in this specification are embodied as computer-executable instructions, these computer-executable instructions can be implemented in any suitable manner, including as a number of functional facilities, each providing one or more operations to complete the execution of algorithms operating according to these techniques. A functional facility, however instantiated, is a structural component of a computer system that, when integrated with and executed by one or more computers, causes the one or more computers to perform a specific operational role. A functional facility can be a portion of a software element or an entire software element. For example, a functional facility can be implemented as a function of a process, or as a discrete process, or as any other suitable processing unit. If the techniques described in this specification are implemented as multiple functional facilities, each functional facility can be implemented in its own way; they need not all be implemented in the same way. In addition, these functional facilities can be executed in parallel and/or in series, as appropriate, and can pass information between one another using shared memory on the computer(s) on which they are running, using a message-passing protocol, or in any other suitable manner.
[00119] Generally, functional facilities include routines, programs, objects, components, data structures, etc., that perform specific tasks or implement particular abstract data types. Typically, the functionality of the functional facilities can be combined or distributed as desired in the systems in which they operate. In some implementations, one or more functional facilities that carry out the techniques described in this specification can together form a complete software package. These functional facilities can, in alternative embodiments, be adapted to interact with other, unrelated functional facilities and/or processes, to implement a software program application.
[00120] Some exemplary functional facilities have been described in this specification for carrying out one or more tasks. It should be appreciated, however, that the functional facilities and division of tasks described are merely illustrative of the type of functional facilities that can implement the exemplary techniques described in this specification, and that the embodiments are not limited to being implemented in any specific number, division, or type of functional facilities. In some implementations, all functionality can be implemented in a single functional facility. It should also be appreciated that, in some implementations, some of the functional facilities described in this specification can be implemented together with or separately from others (that is, as a single unit or as separate units), or some of these functional facilities may not be implemented.
[00121] Computer-executable instructions that implement the techniques described in this specification (when implemented as one or more functional facilities or in any other manner) can, in some embodiments, be encoded on one or more computer-readable media to provide functionality to the media. Computer-readable media include magnetic media, such as a hard disk drive, optical media, such as a compact disc (CD) or digital versatile disc (DVD), persistent or non-persistent solid-state memory (for example, Flash memory, Magnetic RAM, etc.), or any other suitable storage medium. Such a computer-readable medium can be implemented in any suitable manner. As used in this specification, computer-readable medium (also called computer-readable storage medium) refers to tangible storage media. Tangible storage media are non-transitory and have at least one physical, structural component. In a computer-readable medium, as used in this specification, at least one physical, structural component has at least one physical property that can be altered in some way during a process of creating the medium with embedded information, a process of recording information on it, or any other process of encoding the medium with information. For example, the magnetization state of a portion of a physical structure of a computer-readable medium can be changed during a recording process.
[00122] In addition, some techniques described above comprise acts of storing information (for example, data and/or instructions) in certain formats for use by these techniques. In some implementations of these techniques, such as implementations in which the techniques are implemented as computer-executable instructions, the information can be encoded on a computer-readable storage medium. Where specific structures are described in this specification as advantageous formats for storing this information, these structures can be used to convey a physical organization of the information when encoded on the storage medium. These advantageous structures can then provide functionality to the storage medium by affecting the operations of one or more processors interacting with the information; for example, by increasing the efficiency of the computer operations performed by the processor(s).
[00123] In some, but not all, implementations in which the techniques can be embodied as computer-executable instructions, these instructions can be executed on one or more suitable computing devices operating in any suitable computer system, or one or more computing devices (or one or more processors of one or more computing devices) can be programmed to execute the computer-executable instructions. A computing device or processor can be programmed to execute the instructions when the instructions are stored in a manner accessible to the computing device or processor, such as in a data store (for example, an on-chip cache or instruction register, a computer-readable storage medium accessible via a bus, a computer-readable storage medium accessible via one or more networks and accessible by the device/processor, etc.). Functional facilities comprising these computer-executable instructions can be integrated with and direct the operation of a single multi-purpose programmable digital computing device, a coordinated system of two or more multi-purpose computing devices sharing processing power and jointly carrying out the techniques described in this specification, a single computing device or coordinated system of computing devices (colocated or geographically distributed) dedicated to carrying out the techniques described in this specification, one or more Field-Programmable Gate Arrays (FPGAs) for carrying out the techniques described in this specification, or any other suitable system.
[00124] A computing device can comprise at least one processor, a network adapter and a computer-readable storage medium. A computing device can be, for example, a desktop or laptop personal computer, a personal digital assistant (PDA), a smart mobile phone, a server or any other suitable computing device. A network adapter can be any suitable hardware and/or software to allow the computing device to communicate over wires and/or wirelessly with any other suitable computing device over any suitable computing network. The computing network can include access points, switches, routers, gateways and/or other networking equipment, as well as any wired and/or wireless communication medium suitable for exchanging data between two or more computers, including the Internet. The computer-readable medium can be adapted to store data to be processed and/or instructions to be executed by the processor. The processor enables the processing of data and the execution of instructions. The data and instructions can be stored on the computer-readable storage medium.
[00125] A computing device can additionally have one or more components and peripherals, including input and output devices. These devices can be used, among other things, to present a user interface. Examples of output devices that can be used to provide a user interface include printers or display screens for the visual presentation of output, and speakers or other sound-generating devices for the audible presentation of output. Examples of input devices that can be used for a user interface include keyboards and pointing devices, such as mice, touch pads and digitizing tablets. As another example, a computing device can receive input information through speech recognition or in another audible format.
[00126] Embodiments have been described in which the techniques are implemented in circuits and/or computer-executable instructions. It should be appreciated that some embodiments can be in the form of a method, of which at least one example has been provided. The acts performed as part of the method can be ordered in any suitable manner. Consequently, embodiments can be constructed in which acts are performed in an order different from that illustrated, which can include performing some acts simultaneously, even though they are shown as sequential acts in the illustrative embodiments.
[00127] The various aspects of the embodiments described above can be used alone, in combination, or in a variety of arrangements not specifically discussed in the embodiments described in the foregoing, and are therefore not limited in their application to the details and arrangement of components set out in the description above or illustrated in the drawings. For example, the aspects described in one embodiment can be combined in any manner with the aspects described in other embodiments.
[00128] The use of ordinal terms such as first, second, third, etc., in the claims to modify a claim element does not by itself indicate any priority, precedence or order of one claim element over another, or the temporal order in which the acts of a method are performed, but such terms are used merely as labels to distinguish one claim element having a certain name from another element having the same name (but for the use of the ordinal term), so as to distinguish the claim elements.
[00129] In addition, the phraseology and terminology used in this specification are for the purpose of description and should not be regarded as limiting. The use of including, comprising, having, containing, involving, and variations thereof in this specification is intended to encompass the items listed thereafter and equivalents thereof, as well as additional items.
[00130] The word exemplary is used in this specification to mean serving as an example, instance or illustration. Any embodiment, implementation, process, resource, etc. described in this specification as exemplary should therefore be understood as an illustrative example and should not be understood as a preferred or advantageous example, unless otherwise indicated.
[00131] Having thus described several aspects of at least one embodiment, it should be appreciated that various alterations, modifications and improvements will readily occur to those skilled in the art. Such alterations, modifications and improvements are intended to be part of this disclosure and are intended to be within the spirit and scope of the principles described in this specification. Therefore, the foregoing description and drawings are by way of example only.
Claims (26)
1. Decoding method for decoding video data, the method characterized by the fact that it comprises:
receiving compressed video data related to a set of frames; and
calculating, using a decoder-side predictor refinement technique, a new motion vector for a current frame of the set of frames, wherein the new motion vector estimates the motion for the current frame based on one or more reference frames, comprising:
retrieving a first motion vector associated with the current frame;
performing a first portion of a decoding process using the first motion vector;
retrieving a second motion vector associated with the current frame that is different from the first motion vector; and
performing a second portion of the decoding process using the second motion vector.
2. Decoding method according to claim 1, characterized by the fact that:
the first motion vector comprises an unrefined motion vector;
the second motion vector comprises a refined motion vector, wherein the refined MV is refined using a decoder-side predictor refinement technique;
the first portion of the decoding process comprises a parsing portion, a motion vector derivation portion, or both; and
the second portion of the decoding process comprises a reconstruction portion.
3. Decoding method according to claim 1, characterized by the fact that it further comprises:
retrieving a third motion vector associated with a second frame, wherein the third motion vector is a refined motion vector;
performing the first portion of the decoding process using the first motion vector and the third motion vector; and
performing the second portion of the decoding process using the second motion vector and the third motion vector.
4. Decoding method according to claim 3, characterized by the fact that performing the first portion of the decoding process comprises executing a motion vector derivation portion using the first motion vector and the third motion vector, wherein the motion vector derivation portion comprises motion vector prediction derivation, merge candidate derivation, or both.
5. Decoding method according to claim 4, characterized by the fact that performing the first portion of the decoding process comprises referring to the first motion vector as a decoded motion vector of the current frame.
6. Decoding method according to claim 3, characterized by the fact that it further comprises using the second motion vector and the third motion vector to perform motion compensation, overlapped block motion compensation, deblocking, or any combination thereof.
7. Decoding method according to claim 1, characterized by the fact that it further comprises:
determining that a coding tree unit constraint is not applied to the compressed video data; and
wherein retrieving the first motion vector associated with the current frame comprises retrieving:
an unrefined motion vector of the current frame; and
a refined motion vector associated with a second frame.
8. Decoding method according to claim 1, characterized by the fact that retrieving the first motion vector associated with the current frame comprises retrieving:
an unrefined motion vector of a current coding tree unit row;
a refined motion vector of an upper coding tree unit row, of another tile or of another slice; and
a refined motion vector associated with a second frame.
[9]
9. Decoding method for decoding video data, the method characterized by the fact that it comprises:
receiving compressed video data related to a frameset; and calculate, using a decoder side predictor refinement technique, a new motion vector for a current frame from the frameset, where
Petition 870190062098, of 7/3/2019, p. 86/121
4/12 the new motion vector estimates the motion for the current frame based on one or more reference frames, comprising:
receiving a signal indicating an initial candidate index for a list of initial motion vector candidates;
determining a first motion vector candidate in the initial motion vector candidate list and a second motion vector candidate comprises a difference which is below a predetermined threshold;
removing the second motion vector candidate from the initial motion vector candidate list, not adding the second motion vector candidate to the initial motion vector candidate list, or both; and calculating the new motion vector based on the candidate list and the initial candidate index.
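As an illustrative sketch of the pruning step in claim 9 (the L1 difference metric and the threshold value are assumptions; the claim only requires some difference measure below a predetermined threshold):

```python
def prune_candidates(candidates, threshold):
    """Build the initial MV candidate list, skipping any candidate whose
    difference from an already-accepted candidate is below the threshold."""
    pruned = []
    for mv in candidates:
        # keep the candidate only if it is far enough from every kept one
        if all(abs(mv[0] - kept[0]) + abs(mv[1] - kept[1]) >= threshold
               for kept in pruned):
            pruned.append(mv)
    return pruned

# prune_candidates([(4, 4), (5, 4), (12, 0)], 2) -> [(4, 4), (12, 0)]
```

Dropping near-duplicate candidates keeps the signaled initial candidate index meaningful, since each surviving entry leads the refinement search to a genuinely different starting point.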
[10]
10. Decoding method, according to claim 9, characterized by the fact that it further comprises:
analyzing a new motion vector candidate, the motion vector candidate comprising a pair of motion vectors;
determining, based on the analysis, that the pair of motion vectors is along the same motion path; and adding the motion vector pair to the initial motion vector candidate list.
[11]
11. Decoding method, according to claim 9, characterized by the fact that it further comprises:
analyzing a new motion vector candidate, the motion vector candidate comprising a pair of motion vectors;
determining, based on the analysis, that the pair of motion vectors is not along the same motion path;
separating the motion vector pair into two new candidate motion vector pairs; and adding the two new candidate motion vector pairs to the initial motion vector candidate list.
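For illustration, the same-motion-path analysis of claims 10 and 11 can be read as a linear-motion check: a bi-prediction pair lies on one path when the two vectors are proportional to the signed temporal distances to their reference frames. That proportionality rule is an assumption about the otherwise unspecified analysis:

```python
def on_same_trajectory(mv0, mv1, td0, td1):
    """True when a bi-prediction MV pair lies on one linear motion path,
    i.e. the vectors scale with the signed temporal distances td0, td1
    from the current frame to the two reference frames."""
    # cross-multiplied comparison avoids division and rounding error
    return (mv0[0] * td1 == mv1[0] * td0 and
            mv0[1] * td1 == mv1[1] * td0)
```

With one reference frame on each side of the current frame (td0 = 1, td1 = -1), the pair (4, 2) and (-4, -2) passes the check, while (4, 2) and (3, 1) does not and would be separated per claim 11.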
[12]
12. Decoding method, according to claim 11, characterized by the fact that the separation comprises:
adding the first motion vector of the motion vector pair to a first of the two new candidate motion vector pairs;
filling the first of the two new candidate motion vector pairs with a mirrored motion vector derived from the first motion vector;
adding the second motion vector of the motion vector pair to a second of the two new candidate motion vector pairs; and filling the second of the two new candidate motion vector pairs with a mirrored motion vector derived from the second motion vector.
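Continuing the same illustration (not part of the claims), the split of claim 12 can be sketched as follows, where each new pair is completed by mirroring its surviving vector across the current frame; the temporal-distance scaling is an assumption about how the mirrored vector is derived:

```python
def split_pair(mv_pair, td0, td1):
    """Split a bi-prediction MV pair into two new candidate pairs, each
    filled out with a mirrored (temporal-distance-scaled) motion vector."""
    (mv0x, mv0y), (mv1x, mv1y) = mv_pair
    # first new pair: keep MV0, mirror it toward the other reference list
    pair_a = ((mv0x, mv0y), (mv0x * td1 / td0, mv0y * td1 / td0))
    # second new pair: keep MV1, mirror it back the other way
    pair_b = ((mv1x * td0 / td1, mv1y * td0 / td1), (mv1x, mv1y))
    return pair_a, pair_b
```

Each new pair then lies on a single linear motion path by construction, so both can be appended to the initial candidate list.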
[13]
13. Encoding method for encoding video data, the method characterized by the fact that it comprises:
calculating compressed video data related to a set of frames, comprising calculating a new motion vector for a current frame of the set of frames, where the new motion vector estimates the motion for the current frame based on one or more reference frames, comprising:
calculating a first motion vector associated with the current frame;
performing a first portion of an encoding process using the first motion vector;
calculating a second motion vector associated with the current frame that is different from the first motion vector; and performing a second portion of the encoding process using the second motion vector.
[14]
14. Coding method, according to claim
13, characterized by the fact that:
calculating the first motion vector comprises calculating an unrefined motion vector, a set of unrefined motion vectors, or both; and performing the first portion of the encoding process comprises performing a syntax encoding portion, a motion vector derivation portion, a motion vector prediction derivation portion or some combination thereof.
[15]
15. Coding method, according to claim
14, characterized by the fact that the execution of the motion vector prediction derivation portion comprises generating a list of merge candidates, generating a list of advanced motion vector prediction candidates, or both.
[16]
16. Coding method, according to claim
14, characterized by the fact that it further comprises performing motion vector coding, generating a motion vector prediction, or both, using the unrefined motion vector, the set of unrefined motion vectors, or both, such that the unrefined motion vector, the set of unrefined motion vectors, or both are not refined using a decoder-side motion vector refinement tool.
[17]
17. Coding method, according to claim
13, characterized by the fact that:
calculating the second motion vector comprises calculating a refined motion vector, wherein the refined motion vector is calculated using an encoder-side refinement technique;
storing the refined motion vector in a motion vector buffer set; and performing the second portion of the encoding process comprises performing a motion compensation portion, an overlapped block motion compensation portion, a deblocking portion, or some combination thereof.
[18]
18. Device configured to decode video data, the device characterized by the fact that it comprises a processor in communication with memory, the processor being configured to execute instructions stored in memory that cause the processor to:
receive compressed video data related to a set of frames; and calculate, using a decoder-side predictor refinement technique, a new motion vector for a current frame of the set of frames, where
the new motion vector estimates motion for the current frame based on one or more reference frames, comprising:
recovering a first motion vector associated with the current frame;
performing a first portion of a decoding process using the first motion vector;
retrieving a second motion vector associated with the current frame other than the first motion vector; and performing a second portion of the decoding process using the second motion vector.
[19]
19. Apparatus, according to claim 18, characterized by the fact that:
the first motion vector comprises an unrefined motion vector; the second motion vector comprises a refined motion vector, wherein the refined motion vector is refined using a decoder-side predictor refinement technique;
the first portion of the decoding process comprises an analysis portion, a motion vector derivation portion, or both; and the second portion of the decoding process comprises a reconstruction portion.
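As a rough illustration of the two-phase split in claim 19 (the function shapes are hypothetical; only the ordering — parse/derive with the unrefined vector, reconstruct with the refined one — comes from the claim):

```python
def decode_block(parsed_mv, refine_fn):
    """Two-phase decode: the parsing/derivation phase consumes only the
    unrefined MV, so it never waits on the refinement search; the
    reconstruction phase consumes the refined MV."""
    # first portion: analysis / motion vector derivation
    derived = {"mvp": parsed_mv}
    # second portion: reconstruction with the decoder-side-refined vector
    derived["mc_mv"] = refine_fn(parsed_mv)
    return derived
```

The point of the split is pipelining: because `mvp` never depends on `refine_fn`, the parser can run ahead of the refinement search on subsequent blocks.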
[20]
20. Apparatus, according to claim 18, characterized by the fact that the processor is configured to execute instructions stored in memory that cause the processor to:
retrieve a third motion vector associated with a second frame, where the third motion vector is a
refined motion vector;
perform the first portion of the decoding process using the first motion vector and the third motion vector; and perform the second portion of the decoding process using the second motion vector and the third motion vector.
[21]
21. Device configured to decode video data, the device characterized by the fact that it comprises a processor in communication with memory, the processor being configured to execute instructions stored in memory that cause the processor to:
receive compressed video data related to a set of frames; and calculate, using a decoder-side predictor refinement technique, a new motion vector for a current frame of the set of frames, where the new motion vector estimates motion for the current frame based on one or more reference frames, comprising:
receiving a signal indicating an initial candidate index for a list of initial motion vector candidates;
determining that a difference between a first motion vector candidate in the initial motion vector candidate list and a second motion vector candidate is below a predetermined threshold;
removing the second motion vector candidate from the initial motion vector candidate list, not adding the second motion vector candidate to the initial motion vector candidate list, or both; and calculating the new motion vector based on the candidate list and the initial candidate index.
[22]
22. Apparatus, according to claim 21, characterized by the fact that the processor is configured to execute instructions stored in memory that cause the processor to:
analyzing a new motion vector candidate, the motion vector candidate comprising a pair of motion vectors;
determine, based on the analysis, that the pair of motion vectors is along the same motion path; and add the motion vector pair to the initial motion vector candidate list.
[23]
23. Apparatus, according to claim 21, characterized by the fact that the processor is configured to execute instructions stored in memory that cause the processor to:
analyzing a new motion vector candidate, the motion vector candidate comprising a pair of motion vectors;
determine, based on the analysis, that the pair of motion vectors is not along the same motion path;
separate the motion vector pair into two new candidate motion vector pairs; and add the two new candidate motion vector pairs to the initial motion vector candidate list.
[24]
24. Device configured to encode video data, the device characterized by the fact that it comprises a processor in communication with memory, the processor being configured to execute instructions stored in memory that cause the processor to:
calculate compressed video data related to a set of frames, comprising calculating a new motion vector for a current frame of the set of frames, where the new motion vector estimates the motion for the current frame based on one or more reference frames, comprising:
calculating a first motion vector associated with the current frame;
performing a first portion of an encoding process using the first motion vector;
calculating a second motion vector associated with the current frame that is different from the first motion vector; and performing a second portion of the encoding process using the second motion vector.
[25]
25. Apparatus according to claim 24, characterized by the fact that:
calculating the first motion vector comprises calculating an unrefined motion vector, a set of unrefined motion vectors, or both; and performing the first portion of the encoding process comprises performing a syntax encoding portion, a motion vector derivation portion, a motion vector prediction derivation portion or some combination thereof.
[26]
26. Apparatus according to claim 24, characterized by the fact that:
calculating the second motion vector comprises calculating a refined motion vector, wherein the refined motion vector is calculated using an encoder side refinement technique;
storing the refined motion vector in a motion vector buffer set; and performing the second portion of the encoding process comprises performing a motion compensation portion, an overlapped block motion compensation portion, a deblocking portion, or some combination thereof.
Similar technologies:
Publication number | Publication date | Patent title
BR112019013832A2|2020-01-28|decoder side motion vector restoration for video encoding
CN109565590B|2021-05-07|Model-based motion vector derivation for video coding
US10602180B2|2020-03-24|Motion vector prediction
TWI717586B|2021-02-01|Deriving motion vector information at a video decoder
TWI714565B|2021-01-01|Motion vector derivation in video coding
TWI669951B|2019-08-21|Multi-hypotheses merge mode
KR20170125086A|2017-11-13|Image prediction method and related apparatus
KR102344430B1|2021-12-27|Motion vector improvement for multi-reference prediction
TW202025725A|2020-07-01|Sub-block mv inheritance between color components
BR112020015246A2|2021-01-26|accessible hardware restricted motion vector refinement
WO2020103946A1|2020-05-28|Signaling for multi-reference line prediction and multi-hypothesis prediction
TWI711300B|2020-11-21|Signaling for illumination compensation
TW202013974A|2020-04-01|Methods and apparatus for encoding/decoding video data
JP2021506180A|2021-02-18|Video data inter-prediction method and equipment
US20190268611A1|2019-08-29|Intelligent Mode Assignment In Video Coding
KR20200128036A|2020-11-11|Improvements to advanced temporal motion vector prediction
BR112020003938A2|2020-09-08|motion compensation at a finer precision than the motion vector differential
BR112021001563A2|2021-04-20|method and inter prediction apparatus
WO2020075053A1|2020-04-16|Generation and usage of combined affine merge candidate
Family patents:
Publication number | Publication date
EP3545682A4|2020-04-29|
EP3545682A1|2019-10-02|
WO2018127119A1|2018-07-12|
TW201841505A|2018-11-16|
CN110169073A|2019-08-23|
TWI677238B|2019-11-11|
US20180192071A1|2018-07-05|
Cited references:
Publication number | Filing date | Publication date | Applicant | Patent title

BRPI0918478A2|2008-09-04|2015-12-01|Thomson Licensing|methods and apparatus for prediction refinement using implicit motion prediction|
US9060176B2|2009-10-01|2015-06-16|Ntt Docomo, Inc.|Motion vector prediction in video coding|
WO2012045225A1|2010-10-06|2012-04-12|Intel Corporation|System and method for low complexity motion vector derivation|
CN102710934B|2011-01-22|2015-05-06|华为技术有限公司|Motion predicting or compensating method|
CN102611886A|2011-01-22|2012-07-25|华为技术有限公司|Method for predicting or compensating motion|
KR102269655B1|2012-02-04|2021-06-25|엘지전자 주식회사|Video encoding method, video decoding method, and device using same|
WO2015006951A1|2013-07-18|2015-01-22|Mediatek Singapore Pte. Ltd.|Methods for fast encoder decision|US10523964B2|2017-03-13|2019-12-31|Qualcomm Incorporated|Inter prediction refinement based on bi-directional optical flow |
US10904565B2|2017-06-23|2021-01-26|Qualcomm Incorporated|Memory-bandwidth-efficient design for bi-directional optical flow |
US10798402B2|2017-10-24|2020-10-06|Google Llc|Same frame motion estimation and compensation|
US10812810B2|2018-02-06|2020-10-20|Tencent America LLC|Method and apparatus for video coding in merge mode|
US10958928B2|2018-04-10|2021-03-23|Qualcomm Incorporated|Decoder-side motion vector derivation for video coding|
KR20200124755A|2018-04-13|2020-11-03|엘지전자 주식회사|Inter prediction method and apparatus in video processing system|
US10863190B2|2018-06-14|2020-12-08|Tencent America LLC|Techniques for memory bandwidth optimization in bi-predicted motion vector refinement|
JP2021530936A|2018-06-29|2021-11-11|北京字節跳動網絡技術有限公司Beijing Bytedance Network Technology Co., Ltd.|Look-up table updates: FIFO, restricted FIFO|
CN110662064A|2018-06-29|2020-01-07|北京字节跳动网络技术有限公司|Checking order of motion candidates in LUT|
WO2020003282A1|2018-06-29|2020-01-02|Beijing Bytedance Network Technology Co., Ltd.|Managing motion vector predictors for video coding|
EP3794824A1|2018-06-29|2021-03-24|Beijing Bytedance Network Technology Co. Ltd.|Conditions for updating luts|
GB2588528A|2018-06-29|2021-04-28|Beijing Bytedance Network Tech Co Ltd|Selection of coded motion information for LUT updating|
TWI735902B|2018-07-02|2021-08-11|大陸商北京字節跳動網絡技術有限公司|Lookup table with intra frame prediction and intra frame predication from non adjacent blocks|
US10638153B2|2018-07-02|2020-04-28|Tencent America LLC|For decoder side MV derivation and refinement|
US10701384B2|2018-08-01|2020-06-30|Tencent America LLC|Method and apparatus for improvement on decoder side motion derivation and refinement|
CN110809155A|2018-08-04|2020-02-18|北京字节跳动网络技术有限公司|Restriction using updated motion information|
US11184635B2|2018-08-31|2021-11-23|Tencent America LLC|Method and apparatus for video coding with motion vector constraints|
TW202025760A|2018-09-12|2020-07-01|大陸商北京字節跳動網絡技術有限公司|How many hmvp candidates to be checked|
WO2020057524A1|2018-09-19|2020-03-26|Huawei Technologies Co., Ltd.|Method for skipping refinement based on patch similarity in bilinear interpolation based decoder-side motion vector refinement|
WO2020073928A1|2018-10-09|2020-04-16|Huawei Technologies Co., Ltd.|Inter prediction method and apparatus|
US11146810B2|2018-11-27|2021-10-12|Qualcomm Incorporated|Decoder-side motion vector refinement|
WO2020141913A1|2019-01-01|2020-07-09|엘지전자 주식회사|Method and apparatus for processing video signal on basis of inter prediction|
WO2020141912A1|2019-01-01|2020-07-09|엘지전자 주식회사|Method and apparatus for processing video signal on basis of inter prediction|
KR20210094664A|2019-01-02|2021-07-29|텔레폰악티에볼라겟엘엠에릭슨|Side-motion refinement in video encoding/decoding systems|
CN113383554A|2019-01-13|2021-09-10|北京字节跳动网络技术有限公司|Interaction between LUTs and shared Merge lists|
WO2020164580A1|2019-02-14|2020-08-20|Beijing Bytedance Network Technology Co., Ltd.|Size selective application of decoder side refining tools|
CN113545069A|2019-03-03|2021-10-22|北京字节跳动网络技术有限公司|Motion vector management for decoder-side motion vector refinement|
CN113597759A|2019-03-11|2021-11-02|北京字节跳动网络技术有限公司|Motion vector refinement in video coding and decoding|
CN109803175B|2019-03-12|2021-03-26|京东方科技集团股份有限公司|Video processing method and device, video processing equipment and storage medium|
US11172212B2|2019-06-06|2021-11-09|Qualcomm Incorporated|Decoder-side refinement tool on/off control|
CN110460859A|2019-08-21|2019-11-15|浙江大华技术股份有限公司|Application method, codec and the storage device of historical movement vector list|
WO2021068955A1|2019-10-12|2021-04-15|Beijing Bytedance Network Technology Co., Ltd.|Use and signaling of refining video coding tools|
Legal status:
2021-10-13| B350| Update of information on the portal [chapter 15.35 patent gazette]|
Priority:
Application number | Filing date | Patent title
US201762442472P| true| 2017-01-05|2017-01-05|
US201762479350P| true| 2017-03-31|2017-03-31|
US15/861,476|US20180192071A1|2017-01-05|2018-01-03|Decoder-side motion vector restoration for video coding|
PCT/CN2018/071518|WO2018127119A1|2017-01-05|2018-01-05|Decoder-side motion vector restoration for video coding|